1. Introduction and statement of results


The prime numbers which are of the form $a^2 + b^2$ are characterized in a
beautiful theorem of Fermat. It is easy to see that no prime $p = 4n - 1$ can be so
written, and Fermat proved that all $p = 4n + 1$ can be. Today we know that for a
general binary quadratic form $\varphi(a,b) = \alpha a^2 + \beta ab + \gamma b^2$ which is irreducible the
primes represented are characterized by congruence and class group conditions.
Therefore $\varphi$ represents a positive density of primes provided it satisfies a few
local conditions. In fact a general quadratic irreducible polynomial in two
variables is known [Iw] to represent the expected order of primes (these are not
characterized in any simple fashion). Polynomials in one variable are naturally
more difficult and only the case of linear polynomials is settled, due to Dirichlet.

In this paper we prove that there are infinitely many primes of the form
$a^2 + b^4$, in fact getting the asymptotic formula. Our main result is


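Theorem 1. We have

$$\mathop{\sum\sum}_{a^2+b^4 \le x} \Lambda(a^2+b^4) = 4\pi^{-1}\kappa x^{3/4}\left\{1 + O\left(\frac{\log\log x}{\log x}\right)\right\} \tag{1.1}$$

where $a$, $b$ run over positive integers and

$$\kappa = \int_0^1 (1-t^4)^{1/2}\,dt = \Gamma(\tfrac14)^2/6\sqrt{2\pi}. \tag{1.2}$$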


Here of course, $\Lambda$ denotes the von Mangoldt function and $\Gamma$ the Euler
gamma function. The factor $4/\pi$ is meaningful; it comes from the product
(2.17) which in our case is computed in (4.8). Also the elliptic integral (1.2)
arises naturally from the counting (with multiplicity included) of the integers
$n \le x$, $n = a^2 + b^4$ (see (3.15) and take $d = 1$). In view of these computations
one can interpret $4/\pi\log x$ as the “probability” of such an integer being prime.
By comparing (1.1) with the asymptotic formula in the case of $a^2 + b^2$ (change
$x^{3/4}$ to $x$ and $t^4$ to $t^2$), we see that the probability of an integer $a^2 + b^2$ being
prime is the same when we are told that $b$ is a square as it is when we are told
that $b$ is not a square. In contrast to the examples given above which involved
sets of primes of order $x(\log x)^{-1}$ and $x(\log x)^{-3/2}$, the one given here is much
thinner.
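As a numerical sanity check on (1.1) and (1.2), the following Python sketch compares the closed form for the constant in (1.2) with a direct quadrature, and evaluates both sides of (1.1) for a moderate $x$. The function names and the use of the midpoint rule are our own illustrative choices, not part of the argument; at such small $x$ the relative error term in (1.1) is still sizeable, so only rough agreement should be expected.

```python
import math

def kappa_closed_form():
    # Gamma(1/4)^2 / (6 * sqrt(2*pi)), the value stated in (1.2)
    return math.gamma(0.25) ** 2 / (6.0 * math.sqrt(2.0 * math.pi))

def kappa_quadrature(steps=200_000):
    # Midpoint rule for the integral of (1 - t^4)^(1/2) over [0, 1]
    h = 1.0 / steps
    return h * sum(math.sqrt(1.0 - ((i + 0.5) * h) ** 4) for i in range(steps))

def von_mangoldt_table(limit):
    # lam[n] = log p if n is a power of the prime p, and 0 otherwise
    lam = [0.0] * (limit + 1)
    is_composite = bytearray(limit + 1)
    for p in range(2, limit + 1):
        if not is_composite[p]:
            for m in range(p * p, limit + 1, p):
                is_composite[m] = 1
            q = p
            while q <= limit:
                lam[q] = math.log(p)
                q *= p
    return lam

def weighted_count(x):
    # Left side of (1.1): sum of Lambda(a^2 + b^4) over positive integers a, b
    lam = von_mangoldt_table(x)
    total = 0.0
    b = 1
    while b ** 4 < x:
        a = 1
        while a * a + b ** 4 <= x:
            total += lam[a * a + b ** 4]
            a += 1
        b += 1
    return total

def main_term(x):
    # Main term of (1.1): (4/pi) * kappa * x^(3/4)
    return 4.0 / math.pi * kappa_closed_form() * x ** 0.75
```

For instance, `weighted_count(10**6) / main_term(10**6)` gives the ratio of the two sides of (1.1) at $x = 10^6$.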

Our work was inspired by results of Fouvry and Iwaniec [FI] wherein they
proved the asymptotic formula

$$\mathop{\sum\sum}_{a^2+b^2 \le x} \Lambda(a^2+b^2)\Lambda(b) = \sigma x\left\{1 + O\left((\log x)^{-A}\right)\right\} \tag{1.3}$$

with a positive constant $\sigma$ which gives the primes of the form $a^2 + b^2$ with $b$
prime.

Theorem 1 admits a number of refinements. It follows immediately from
our proof that the expected asymptotic formula holds when the variables $a$, $b$
are restricted to any fixed arithmetic progressions, and moreover that the
distribution of such points is uniform within any non-pathological planar domain.
We expect, but did not check, that the methods carry over to the prime
values of $\varphi(a,b^2)$ for $\varphi$ a quite general binary quadratic form. The method fails
however to produce primes of the type $\varphi(a,b^2)$ where $\varphi$ is a non-homogeneous
quadratic polynomial.

One may look at the equation

$$p = a^2 + b^4 \tag{1.4}$$

in two different ways. First, starting from the sequence of Fermat primes
$p = a^2 + b^2$ one may try to select those for which $b$ is square. We take the
alternative approach of beginning with the integers

$$n = a^2 + b^4 \tag{1.5}$$

and using the sieve to select primes. In the first case one would begin with
a rather dense set but would then have to select a very thin subset. In our
approach we begin with a very thin set but one which is sufficiently regular in
behaviour for us to detect primes.

In its classical format the sieve is unable to detect primes for a very
intrinsic reason, first pointed out by Selberg [Se] and known as the parity problem.
The asymptotic sieve of Bombieri [Bo1], [FI1] clearly exhibits this problem. We
base our proof on a new version of the sieve [FI3], which should be regarded
as a development of Bombieri’s sieve and was designed specifically to break
this barrier and to simultaneously treat thinner sets of primes. This paper,
[FI3], represents an indispensable part of the proof of Theorem 1. Originally
we had intended to include it within the current paper but, expecting it to
trigger other applications, we have split it off. Here, in Section 2, we briefly
summarize the necessary results from that paper.

Any sieve requires good estimates for the remainder term in counting the
numbers (1.5) divisible by a given integer $d$. Such an estimate is required also
by our sieve and for our problem a best possible estimate of this type was
provided in [FI] as a subtle deduction from the Davenport-Halberstam Theorem
[DH]. It was this particular result of [FI] which most directly motivated the
current work. In Section 3 we give, for completeness, that part of their work
in a form immediately applicable to our problem. We also briefly describe
at the end of that section how the other standard sieve assumptions listed in
Section 2 follow easily for the particular sequence considered here.

In departing from the classical sieve, we introduce (see (2.11)) an
additional axiom which overcomes the parity problem. As a result of this we are
now required also to verify estimates for sums of the type

$$\sum_{\ell}\Big|\sum_{m}\beta(m)a_{\ell m}\Big| \tag{1.6}$$

where $\beta(m)$ is very much like the Möbius function and $a_n$ is the number of
representations of (1.5) for given $n = \ell m$. The estimates for these bilinear
forms constitute the major part of the paper and several of them are of interest
on their own.

For example we describe an interesting by-product of one part of this
work. Given a Fermat prime $p$ we define its spin $\sigma_p$ to be the Jacobi symbol
$\left(\frac{s}{r}\right)$ where $p = r^2 + s^2$ is the unique representation in positive integers with
$r$ odd. We show the equidistribution of the positive and negative spins $\sigma_p$.
Actually we obtain this in a strong form, specifically:


Theorem 2. We have

$$\mathop{\sum\sum}_{r^2+s^2 = p \le x}\left(\frac{s}{r}\right) \ll x^{76/77} \tag{1.7}$$

where $r$, $s$ run over positive integers with $r$ odd and $\left(\frac{s}{r}\right)$ is the Jacobi symbol.


Remarks. The primes in (1.7) are not directly related to those in (1.4).
As in the case of Theorem 1 the bound (1.7) holds without change when $\pi =
r + is$ is restricted to a fixed sector and in any fixed arithmetic progression. The
exponent $\frac{76}{77}$ can be reduced by refining our estimates for the relevant bilinear
forms (see Theorem $2^{\psi}$ in Section 26 for a more general statement and further
remarks).
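To make the definition of the spin concrete, here is a small Python sketch that computes the Jacobi symbol attached to the representation $p = r^2 + s^2$ with $r$ odd, and sums these spins over primes up to $x$. The helper names (`jacobi`, `spin`, `spin_sum`) are hypothetical and the trial-division primality test is only meant for small experiments; nothing here reflects the method of proof.

```python
from math import isqrt

def jacobi(a, n):
    # Jacobi symbol (a/n) for odd positive n, computed by quadratic reciprocity
    if n <= 0 or n % 2 == 0:
        raise ValueError("n must be a positive odd integer")
    a %= n
    result = 1
    while a != 0:
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0

def is_prime(n):
    # Trial division, adequate for small n
    if n < 2:
        return False
    for q in range(2, isqrt(n) + 1):
        if n % q == 0:
            return False
    return True

def spin(p):
    # For a prime p = r^2 + s^2 with r, s positive and r odd, return (s/r)
    for r in range(1, isqrt(p) + 1, 2):
        s2 = p - r * r
        s = isqrt(s2)
        if s > 0 and s * s == s2:
            return jacobi(s, r)
    raise ValueError("p is not a prime congruent to 1 modulo 4")

def spin_sum(x):
    # Sum of the spins over primes p <= x with p = 1 (mod 4)
    return sum(spin(p) for p in range(5, x + 1, 4) if is_prime(p))
```

For example, $13 = 3^2 + 2^2$ has spin $\left(\frac{2}{3}\right) = -1$, while $41 = 5^2 + 4^2$ has spin $\left(\frac{4}{5}\right) = 1$.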


In studying bilinear forms of type (1.6) we are led, following some
preliminary technical reductions in Sections 4 and 5, to the lattice point problem of
counting points in an arithmetic progression inside the “biquadratic ellipse”

$$b_1^4 - 2\gamma b_1^2 b_2^2 + b_2^4 \le x \tag{1.8}$$

for a parameter $0 < \gamma < 1$. The counting is accomplished in Sections 6-9 by
a rather delicate harmonic analysis necessitated by the degree of uniformity
required. The modulus $\Delta$ of the progression is very large and there are not
many lattice points compared to the area of the region, at least for a given value
of the parameter. It is in this counting that we exploit the great regularity in
the distribution of the squares and after this step the problem of the thinness
of the sifting set is gone.

There remains the task of summing the resulting main terms, that is those
coming from the zero frequency in the harmonic analysis, over the relevant
values of the parameter $\gamma$. The structure of these main terms is arithmetic in
nature and there is some cancellation to be found in their sum, albeit requiring
for its detection techniques more subtle than those needed for the nonzero
frequencies. This sum is given by a bilinear form (not to be confused with
(1.6)) which involves roots of quadratic congruences, again to modulus $\Delta$,
which are then, as is familiar, expressed in terms of the Jacobi symbol and
arithmetic progressions, this time with moduli $d$ running through the divisors
of $\Delta$. Decomposing in Section 10 the relevant sum in accordance with the size
of the divisors $d$ we find that we need very different techniques to deal with
the divisors in different ranges.

For all but the smallest and largest ranges the relevant sum may be treated 
by rather general mean-value theorems of Barban-Davenport-Halberstam type. 
That is we need to estimate Jacobi-twisted sums on average over all residue 
classes and their moduli. Although, as in other theorems of this type, the 
results pertain for linear forms with very general coefficients, because of the 
rather hybrid nature of our sum (the real characters over progressions are 
mixed with the multiplicative inverse) new ideas are required. The goal is 
achieved in three steps; see Sections 11, 12, 13, their combination in Section 14 
and application in Section 15. 

In Section 16 we treat the smallest moduli. We require what is in essence
an equidistribution result on Gaussian primes in sectors and residue classes.
Now the shape of our coefficients is crucial; the cancellation will come from
their resemblance to the Möbius function. The machinery for this result was
developed by Hecke [He]. However, greater uniformity in the conductor is
required than could have been done by him at a time prior to the famous
estimate of Siegel [Si]. Siegel’s work deals with $L$-functions of real Dirichlet
characters rather than Grossencharacters, but today it is a routine matter
to extend his argument to our case. Here we employ an elegant argument
of Goldfeld [Go]. This analogue of the Siegel-Walfisz bound is applied to
our problem as in the original framework and the implied constants are not
computable.

There remains only the treatment of the largest moduli. We regard this as 
perhaps the most interesting part and hence we save it for last. In Section 17 
we make some preliminary reductions and state our final goal, Proposition 17.2, 
for these sums. In Section 18 we show how this proposition, when combined 
with our earlier results, completes the proof of the main Theorem 1. 

It has been familiar since the time of Dirichlet that, in dealing with various
ranges in a divisor problem, it is often profitable to replace large divisors by
smaller ones by means of the involution $d \to |\Delta|/d$. Already this was required
here in Section 12 for the final two steps in the treatment of the mid-sized
moduli. An interesting feature in our case is that, due to the presence of the
Jacobi symbol, the law of quadratic reciprocity plays a crucial role in this
involution and an extra Jacobi symbol (of the type occurring in Theorem 2)
emerges in the transformed sum (see Lemma 17.1). This extra symbol (see
(20.1)) is essentially treated as a function of one complex variable and as such
it is reminiscent of the Kubota symbol. This “Jacobi-Kubota symbol” later
creates in Section 23, by summation over all Gaussian integers of given norm, a
function on the positive integers to which we refer as a “quadratic eigenvalue”.

Because the mean-value theorems of Sections 11-13 hold for such general 
coefficients the appearance of the Jacobi-Kubota symbol does not affect the 
arguments of those sections so we are able to cover completely the range of 
mid-sized moduli. When we again apply the Dirichlet involution, this time to 
transform the largest moduli, we now arrive in the same range of small moduli 
which have just been treated in Section 16. Now however the presence of the 
Jacobi-Kubota symbol destroys the previous argument, that is the theory of 
Hecke L-functions is not applicable here. 

In the solution of this final part of our problem a prominent role is played 
by the real characters in the Gaussian domain. Dirichlet [Di] was first to 
treat these as an extension of the Legendre symbol. In this paper we require 
this Dirichlet symbol for all primary numbers, not just primes, in the same 
way the Jacobi symbol generalizes that of Legendre. These are introduced in 
Section 19. They enter our study via a kind of theta multiplier rule for the 
multiplication of the Jacobi-Kubota symbol, a rule we establish in Section 20. 

Of particular interest are the results of Sections 21-22 concerned with
general bilinear forms with the Dirichlet symbol and special linear forms with
the Jacobi-Kubota symbol. This time a cancellation is received from the sign
changes of these symbols rather than from those of the Möbius function, which
also makes an appearance, arising as coefficients from our particular sieve
theory. Originally, in the estimation of both of the above forms we used the
Burgess bound for short character sums (thus appealing indirectly to the
Riemann Hypothesis for curves, that is the Hasse-Weil Theorem). This allowed us
to obtain results which in some cases are stronger than those presented here.
After several attempts to simplify the original arguments we ended up with
the current treatment for bilinear forms producing satisfactory results in wider
ranges. Because of the wider ranges in the bilinear forms we were able to accept
linear form estimates which are less uniform in the involved parameters, and
consequently were able to dispense with the Burgess bound, replacing it (see
Section 22) with the more elementary Pólya-Vinogradov inequality. Should
we have combined the original and the present arguments then a substantial
quantitative sharpening of Theorem 2 would follow.

Our estimates for the bilinear form with the Dirichlet symbol and for the 
special linear form with the Jacobi-Kubota symbol are then in Section 23, 
via the multiplier rule, transformed into corresponding results for forms in 
quadratic eigenvalues. 
Our final job is to transform (in Sections 25 and 26) these linear and
bilinear forms in the quadratic eigenvalues into sums supported on the primes
(which completes Theorem 2) or sums weighted by Möbius type functions
(which completes Proposition 17.2 and hence Theorem 1). There are by now
a number of known combinatorial identities which can be used to achieve such
a goal. The identity we introduce in Section 24 has some novel features. In
particular, it enables us to reduce rather quickly from Möbius-type functions
to primes and hence allows us to achieve two goals at once.

The statement of Theorem 1 may be re-interpreted in terms of the elliptic 
curve 

$$E : y^2 = x^3 - x. \tag{1.9}$$

This curve, the congruent number curve, has complex multiplication by $\mathbb{Z}[i]$
and the corresponding Hasse-Weil $L$-function

$$L_E(s) = \sum_{n=1}^{\infty} \lambda_n n^{-s}$$

is the Mellin transform of a theta series

$$f(z) = \sum_{n=1}^{\infty} \lambda_n e(nz)$$

which is a cusp form of weight two on $\Gamma_0(32)$ and is an eigenfunction of all the
Hecke operators $T_p f = \lambda_p f$. Precisely, the eigenvalues are given by

$$\lambda_n = \sum_{\substack{w\bar w = n \\ w \equiv 1 \ (\mathrm{mod}\ 2(1+i))}} w,$$

where the summation is restricted to $w \equiv 1 \pmod{2(1+i)}$, that is $w$ is primary.
Hence $\lambda_p = \pi + \bar\pi$ if $p = \pi\bar\pi$ with $\pi$ primary. In particular, if $p = a^2 + b^4$, with
$4 \mid a$, then $\pi = b^2 + ia$ is primary so that $\lambda_p = 2b^2$. Thus Theorem 1 gives
the asymptotic formula for primes for which the Hecke eigenvalue is twice a
square. Using Jacobsthal sums for these primes one expresses this property as

$$-\sum_{0 < x < p/2}\left(\frac{x^3 - x}{p}\right) = \text{square}.$$

The primes of type $p = a^2 + b^4$ give points of infinite order on the quartic
twists

$$E_p : y^2 = x^3 - px,$$

namely $(x,y) = (-b^2, ab)$. That this is not a torsion point follows from the
Lutz-Nagell criterion. We thank Andrew Granville for pointing this out to
us. The parity conjecture asserts in this case that the rank of $E_p$ is odd if
$a \equiv 0 \pmod 4$ and even if $a \equiv 2 \pmod 4$. Recent results concerning points on
quartic twists have been established by Stewart and Top [ST] improving and
generalizing earlier work of Gouvêa and Mazur [GM].
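The elementary parts of this reinterpretation are easy to test numerically. The sketch below (with hypothetical helper names) computes the Hecke eigenvalue at $p$ as minus the full character sum of $x^3 - x$ modulo $p$, the half-range Jacobsthal-type sum displayed above, and checks that $(-b^2, ab)$ lies on $y^2 = x^3 - px$. It assumes $p$ is an odd prime and uses Euler's criterion for the Legendre symbol.

```python
def legendre(a, p):
    # Legendre symbol (a/p) for an odd prime p, by Euler's criterion
    a %= p
    if a == 0:
        return 0
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1

def hecke_eigenvalue(p):
    # lambda_p = -(sum over x mod p of ((x^3 - x)/p)) for E: y^2 = x^3 - x
    return -sum(legendre(x * x * x - x, p) for x in range(p))

def half_jacobsthal(p):
    # -(sum over 0 < x < p/2 of ((x^3 - x)/p))
    return -sum(legendre(x * x * x - x, p) for x in range(1, (p + 1) // 2))

def on_quartic_twist(a, b):
    # Checks that (x, y) = (-b^2, a*b) satisfies y^2 = x^3 - p*x, p = a^2 + b^4
    p = a * a + b ** 4
    x, y = -b * b, a * b
    return y * y == x ** 3 - p * x
```

With $p = 97 = 4^2 + 3^4$ one has $4 \mid a$ and $b = 3$, so the eigenvalue should be $2b^2 = 18$ and the half-range sum should be $b^2 = 9$.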

Further interesting connections to elliptic curves hold for primes of the
form $27a^2 + 4b^6$ and there is some hope to produce such primes using our
arguments in the domain $\mathbb{Z}[\zeta_3]$.

The results of this paper have been announced together with a very brief 
sketch of the main ideas of the proof in the paper [FI2] in the Proceedings of the 
National Academy of Sciences of the USA. We close here by repeating the last 
sentences of that paper: “Although the proofs of our results are rather lengthy 
and complicated we are able to avoid much of the high-powered technology 
frequently used in modern analytic number theory such as the bounds of Weil 
and Deligne. We also do not appeal to the theory of automorphic functions 
although experts will, in several places, detect it bubbling just beneath the 
surface.” 

Acknowledgements. We thank the Institute for Advanced Study for
providing us with excellent conditions during the early stages of this work
beginning in December 1995. HI thanks the University of Toronto for their
hospitality during several short visits. We also enjoyed the hospitality of Carleton
University during the CNTA conference in August 1996. We thank E. Fouvry
for his encouragement. Finally we thank the referee as well as E. Fouvry,
A. Granville, D. Shiu, and especially M. Watkins, for pointing out a number
of minor inaccuracies.

2. Asymptotic sieve for primes 

In this section we state a result of [FI3] in a form which is suitable for the
proof of the main theorem. Let $\mathcal{A} = (a_n)$ be a sequence of real, nonnegative
numbers for $n = 1, 2, 3, \ldots$ Our objective is an asymptotic formula for

$$S(x) = \sum_{p \le x} a_p \log p$$

subject to various hypotheses familiar from sieve theory.

Let $x$ be a given number, sufficiently large in terms of $\mathcal{A}$. Put

$$A(x) = \sum_{n \le x} a_n.$$

We assume the crude bounds

$$A(x) \gg A(\sqrt{x})(\log x)^2, \tag{2.1}$$

$$A(x) \gg x^{1/3}\Big(\sum_{n \le x} a_n^2\Big)^{1/2}. \tag{2.2}$$



THE POLYNOMIAL X 2 + Y 4 CAPTURES ITS PRIMES 




For any $d \ge 1$ we write

$$A_d(x) = \sum_{\substack{n \le x \\ n \equiv 0 \ (\mathrm{mod}\ d)}} a_n = g(d)A(x) + r_d(x) \tag{2.3}$$

where $g$ is a nice multiplicative function and $r_d(x)$ may be regarded as an error
term which is small on average. These must of course be made more specific.
We assume that $g$ has the following properties:

$$0 \le g(p^2) \le g(p) < 1, \tag{2.4}$$

$$g(p) \ll p^{-1}, \tag{2.5}$$
and

$$g(p^2) \ll p^{-2}. \tag{2.6}$$

Furthermore, for all $y \ge 2$,

$$\sum_{p \le y} g(p) = \log\log y + c + O\left((\log y)^{-10}\right), \tag{2.7}$$

where $c$ is a constant depending only on $g$.

We assume another crude bound

$$A_d(x) \ll d^{-1}\tau(d)^8 A(x) \quad \text{uniformly in } d \le x^{1/3}. \tag{2.8}$$

We assume that the error terms satisfy

$$\sum^{3}_{d \le DL^2} |r_d(t)| \le A(x)L^{-2} \tag{2.9}$$

uniformly in $t \le x$, for some $D$ in the range

$$x^{2/3} < D < x. \tag{2.10}$$

Here the superscript 3 in (2.9) restricts the summation to cubefree moduli and
$L = (\log x)^{2^{24}}$.

We require an estimate for bilinear forms of the type

$$\sum_{m}\Big|\sum_{\substack{N < n \le 2N \\ mn \le x \\ (n, m\Pi) = 1}} \beta(n)a_{mn}\Big| \le A(x)(\log x)^{-2^{26}} \tag{2.11}$$

where the coefficients are given by

$$\beta(n) = \beta(n, C) = \mu(n)\sum_{c \mid n,\ c \le C} \mu(c). \tag{2.12}$$

This is required for every $C$ with

$$1 \le C \le xD^{-1}, \tag{2.13}$$

and for every $N$ with

$$\Delta^{-1}\sqrt{D} < N < \delta^{-1}\sqrt{x}, \tag{2.14}$$

for some $\Delta \ge \delta \ge 2$. Here $\Pi$ is the product of all primes $p < P$ with $P$ which
can be chosen at will subject to

$$2 \le P \le \Delta^{1/2^{35}\log\log x}. \tag{2.15}$$

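Proposition 2.1. Assuming the above hypotheses, we have

$$S(x) = HA(x)\left\{1 + O\left(\frac{\log\delta}{\log\Delta}\right)\right\} \tag{2.16}$$

where $H$ is the positive constant given by the convergent product

$$H = \prod_{p}(1 - g(p))\left(1 - \frac{1}{p}\right)^{-1}, \tag{2.17}$$

and the implied constant depends only on the function $g$.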

In practice $\delta$ is a large power of $\log x$ and $\Delta$ is a small power of $x$. For most
sequences all of the above hypotheses are easy to verify with the exception of
(2.9) and (2.11). The hypothesis (2.9) is a traditional one while (2.11) is quite
new in sieve theory.
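The coefficients in (2.12) are simple to compute directly, which helps in seeing why they behave like a truncated Möbius function. The following Python sketch (the names `mobius` and `beta` are ours) evaluates them by trial division. Note that when $C \ge n$ the inner sum runs over all divisors of $n$ and hence vanishes for $n > 1$.

```python
def mobius(n):
    # Moebius function mu(n), by trial division
    if n == 1:
        return 1
    result = 1
    q = 2
    while q * q <= n:
        if n % q == 0:
            n //= q
            if n % q == 0:
                return 0  # a square factor divides the original n
            result = -result
        q += 1
    if n > 1:
        result = -result
    return result

def beta(n, C):
    # beta(n, C) = mu(n) * (sum of mu(c) over divisors c of n with c <= C)
    mu_n = mobius(n)
    if mu_n == 0:
        return 0
    upper = min(n, int(C))
    return mu_n * sum(mobius(c) for c in range(1, upper + 1) if n % c == 0)
```

For example, `beta(6, 3)` sums $\mu(1) + \mu(2) + \mu(3) = -1$ and multiplies by $\mu(6) = 1$.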

We conclude this section by giving some technical results on the divisor 
function which will find repeated application in this paper. 

Lemma 2.2. Fix $k \ge 2$. Any $n \ge 1$ has a divisor $d \le n^{1/k}$ such that

$$\tau(n) \le (2\tau(d))^{\frac{k\log k}{\log 2}},$$

and, in case $n$ is squarefree, then we may strengthen this to $\tau(n) \le (2\tau(d))^k$.
For any $n \ge 1$ we also have

$$\tau(n) \le 9\sum_{d \mid n,\ d \le n^{1/3}} \tau(d).$$


The first two of these three statements are also from [FI3] (see Lemmata
1 and 2 there for the proofs). To prove the last of these we note that

$$\tau_3(n) \le 3\sum_{d \mid n,\ d \le n^{1/3}} \tau(n/d),$$

and hence by Cauchy’s inequality

$$t(n) = \tau_3(n)^2\Big(\sum_{d \mid n} \tau(n/d)^2\tau(d)^{-1}\Big)^{-1} \le 9\sum_{d \mid n,\ d \le n^{1/3}} \tau(d).$$

On the other hand we have $t(n) \ge \tau(n)$ which, due to multiplicativity, can be
checked by verifying on prime powers. This completes the proof of the lemma.
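The last inequality of Lemma 2.2 can be spot-checked by brute force. The sketch below (function names are illustrative) tests the bound of $\tau(n)$ by nine times the sum of $\tau(d)$ over divisors $d \le n^{1/3}$, using the integer condition $d^3 \le n$ to avoid floating-point cube roots.

```python
from math import isqrt

def tau(n):
    # Number of positive divisors of n
    count = 0
    for d in range(1, isqrt(n) + 1):
        if n % d == 0:
            count += 1 if d * d == n else 2
    return count

def small_divisor_sum(n):
    # Sum of tau(d) over divisors d of n with d <= n^(1/3), tested as d^3 <= n
    total = 0
    d = 1
    while d ** 3 <= n:
        if n % d == 0:
            total += tau(d)
        d += 1
    return total

def lemma_holds(n):
    # The third statement of Lemma 2.2 for this particular n
    return tau(n) <= 9 * small_divisor_sum(n)
```

For $n = 12$ the divisors up to $12^{1/3}$ are $1$ and $2$, so the right side is $9 \cdot 3 = 27$ against $\tau(12) = 6$.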

3. The sieve remainder term 

In this section we verify the hypothesis (2.9) by arguments of [FI]. Given
an arithmetic function $\mathfrak{z} : \mathbb{Z} \to \mathbb{C}$ we consider the sequence $\mathcal{A} = (a_n) : \mathbb{N} \to \mathbb{C}$
with

$$a_n = \mathop{\sum\sum}_{a^2+b^2 = n} \mathfrak{z}(b) \tag{3.1}$$

where $a$ and $b$ are integers, not necessarily positive. In our particular sequence
$\mathfrak{z}$ will be supported on squares. Note that from now on this use of $a$, $b$ differs
from that in the introduction. We have

$$A_d(x) = \sum_{\substack{0 < n \le x \\ n \equiv 0 \ (\mathrm{mod}\ d)}} a_n = \mathop{\sum\sum}_{\substack{0 < a^2+b^2 \le x \\ a^2+b^2 \equiv 0 \ (\mathrm{mod}\ d)}} \mathfrak{z}(b).$$

We expect that $A_d(x)$ is well approximated by

$$M_d(x) = \frac{1}{d}\mathop{\sum\sum}_{0 < a^2+b^2 \le x} \mathfrak{z}(b)\rho(b;d)$$

where $\rho(b;d)$ denotes the number of solutions $\alpha \pmod d$ to the congruence

$$\alpha^2 + b^2 \equiv 0 \pmod d.$$

For $b = 1$ we denote $\rho(1;d) = \rho(d)$; it is the multiplicative function such that

$$\rho(p^{\alpha}) = 1 + \chi_4(p)$$

except that $\rho(2^{\alpha}) = 0$ if $\alpha \ge 2$. Here $\chi_4$ is the character of conductor four.
Thus if $4 \nmid d$

$$\rho(d) = \prod_{p \mid d}(1 + \chi_4(p)) = \sum^{\flat}_{\nu \mid d} \chi_4(\nu)$$

and $\rho(d) = 0$ if $4 \mid d$. The notation $\sum^{\flat}$ indicates a summation over squarefree
integers. For any $b$ we have

$$\rho(b;d) = (b, d_2)\,\rho\left(d/(b^2, d)\right) \tag{3.2}$$

where $d_2^2$ denotes the largest square divisor of $d$, that is $d = d_1 d_2^2$ with $d_1$
squarefree.
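Formula (3.2) lends itself to a brute-force check. The Python sketch below counts solutions of the congruence directly and compares with the right side of (3.2), reading $d_2$ as the integer with $d = d_1 d_2^2$ and $d_1$ squarefree. All function names are hypothetical, and the product formula for $\rho(d)$ is the one stated above.

```python
from math import gcd

def chi4(n):
    # The non-principal character modulo 4
    return (0, 1, 0, -1)[n % 4]

def rho(d):
    # Number of roots of nu^2 + 1 = 0 (mod d), from the product over p | d
    if d % 4 == 0:
        return 0
    result = 1
    p = 2
    while d > 1:
        if d % p == 0:
            result *= 1 + chi4(p)
            while d % p == 0:
                d //= p
        p += 1
    return result

def square_part_root(d):
    # The integer d_2 with d = d_1 * d_2^2 and d_1 squarefree
    d2 = 1
    q = 2
    while q * q <= d:
        while d % (q * q) == 0:
            d //= q * q
            d2 *= q
        q += 1
    return d2

def rho_brute(b, d):
    # Direct count of alpha mod d with alpha^2 + b^2 = 0 (mod d)
    return sum(1 for alpha in range(d) if (alpha * alpha + b * b) % d == 0)

def rho_formula(b, d):
    # Right side of (3.2)
    return gcd(b, square_part_root(d)) * rho(d // gcd(b * b, d))
```

For instance, with $d = 125$ and $b = 5$ both sides equal $10$.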
Lemma 3.1. Suppose $\mathfrak{z}(b)$ is supported on squares and $|\mathfrak{z}(b)| \le 2$. Then

$$\sum_{d \le D} |A_d(x) - M_d(x)| \ll D^{1/4}x^{9/16+\varepsilon} \tag{3.3}$$

for any $D \ge 1$ and $\varepsilon > 0$, the implied constant depending only on $\varepsilon$.

Remarks. This result is a modification of Lemma 4 of [FI] for our
particular sequence $\mathcal{A} = (a_n)$ supported on numbers $n = a^2 + c^4$. Of course, in [FI]
the authors had no reason to consider such a thin sequence so their version did
not take advantage of the lacunarity of the squares.


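In our case we have the individual bounds

$$\sum_{d \le x} A_d(x) \ll x^{3/4}(\log x)^2, \tag{3.4}$$

$$\sum_{d \le x} M_d(x) \ll x^{3/4}(\log x)^2. \tag{3.5}$$

These are derived as follows:

$$\sum_{d} A_d(x) \le \mathop{\sum\sum}_{0 < a^2+b^2 \le x} |\mathfrak{z}(b)|\,\tau(a^2+b^2) \le 16\sqrt{x}\sum_{0 \le b \le \sqrt{x}} |\mathfrak{z}(b)| \sum_{d \le \sqrt{x}} \rho(b;d)d^{-1}.$$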

To estimate the inner sum we use the bounds p(b; d) < d 2 p(d) < p(di)p(dz)d 2 , 
for d odd, p(b ; d) < 4 y/d for d a power of 2, and 

$$\sum_{d \le x} \rho(d)d^{-1} \ll \log x.$$

Hence we obtain (3.4) while (3.5) is derived similarly. In view of (3.4) and
(3.5) our estimate (3.3) is trivial if $D \ge x^{3/4}$, as expected. Therefore we can
assume that $D < x^{3/4}$.

The proof of Lemma 3.1 requires an application of harmonic analysis and
it rests on the fact that there is an exceptional well-spacing property of the
rationals $\nu/d \pmod 1$ with $\nu$ ranging over the roots of

$$\nu^2 + 1 \equiv 0 \pmod d.$$

These roots correspond to the primitive representations of the modulus as the
sum of two squares

$$d = r^2 + s^2 \quad \text{with } (r, s) = 1.$$

By choosing $-s < r \le s$ we see that each such representation gives the unique
root defined by $\nu s \equiv r \pmod d$. Hence

$$\frac{\nu}{d} \equiv \frac{r}{sd} - \frac{\bar r}{s} \pmod 1$$

where $\bar r$ denotes the multiplicative inverse of $r$ modulo $s$, that is $r\bar r \equiv 1 \pmod s$.
Here the fraction $\bar r/s$ has much smaller denominator than that of $\nu/d$ whereas
the other term is small, namely

$$\frac{|r|}{sd} < \frac{1}{2s^2}$$

except in the case $r = s = 1$ where equality holds. Therefore the points $\nu/d$
behave as if they repel each other and are distanced considerably further apart
than would appear at first glance. Precisely, if $\nu_1/d_1$ and $\nu_2/d_2$ are distinct
with $r_1$ and $r_2$ having the same sign and $\frac23 < s_1/s_2 < \frac32$ then

$$\left\|\frac{\nu_1}{d_1} - \frac{\nu_2}{d_2}\right\| \ge \frac{1}{s_1 s_2} - \max\left\{\frac{1}{2s_1^2}, \frac{1}{2s_2^2}\right\} > \frac{1}{4s_1 s_2} \ge \frac{1}{4\sqrt{d_1 d_2}}.$$
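The correspondence between primitive representations $d = r^2 + s^2$ and roots of $\nu^2 + 1 \equiv 0 \pmod d$, together with the decomposition of $\nu/d$ modulo 1 into a small term and a fraction of denominator $s$, can be verified exactly with rational arithmetic. The following Python sketch is ours: it assumes the normalization $-s < r \le s$, $s > 0$, takes $d \ge 2$, and relies on `pow(x, -1, m)` for modular inverses, which requires Python 3.8 or later.

```python
from fractions import Fraction
from math import gcd, isqrt

def primitive_representations(d):
    # All (r, s) with d = r^2 + s^2, gcd(r, s) = 1, s > 0 and -s < r <= s
    reps = []
    for s in range(1, isqrt(d) + 1):
        r2 = d - s * s
        r = isqrt(r2)
        if r * r != r2 or gcd(r, s) != 1:
            continue
        for signed_r in {r, -r}:
            if -s < signed_r <= s:
                reps.append((signed_r, s))
    return reps

def root_from_representation(r, s, d):
    # The root nu of nu^2 + 1 = 0 (mod d) determined by nu * s = r (mod d)
    return (r * pow(s, -1, d)) % d

def defect(r, s, d):
    # nu/d - r/(s*d) + rbar/s, which should be an integer
    nu = root_from_representation(r, s, d)
    rbar = pow(r % s, -1, s) if s > 1 else 0  # modulo 1 every residue is 0
    return Fraction(nu, d) - Fraction(r, s * d) + Fraction(rbar, s)
```

For $d = 5$ the representations $(r, s) = (1, 2)$ and $(-1, 2)$ give the roots $\nu = 3$ and $\nu = 2$.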


Thus if the moduli are confined to an interval $\frac89 D < d \le D$ then the points
$\nu/d$ are spaced by $1/4D$ rather than $1/D^2$. Applying the large sieve inequality
of Davenport-Halberstam [DH] for these points we derive


Lemma 3.2. For any complex numbers $\alpha_n$ we have

$$\sum_{D < d \le 2D}\ \sum_{\nu^2+1 \equiv 0 \ (\mathrm{mod}\ d)} \Big|\sum_{n \le N} \alpha_n e\Big(\frac{\nu n}{d}\Big)\Big|^2 \ll (D + N)\|\alpha\|^2$$

where $\|\alpha\|$ denotes the $\ell_2$-norm of $\alpha = (\alpha_n)$ and the implied constant is
absolute.


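By Cauchy’s inequality Lemma 3.2 yields

$$\sum_{d \le D}\ \sum_{\nu^2+1 \equiv 0 \ (d)} \Big|\sum_{n \le N} \alpha_n e\Big(\frac{\nu n}{d}\Big)\Big| \ll D^{1/2}(D + N)^{1/2}\|\alpha\|. \tag{3.6}$$

From this we shall derive a bound for general linear forms in the arithmetic
functions

$$\rho(k, \ell; d) = \sum_{\nu^2+\ell^2 \equiv 0 \ (\mathrm{mod}\ d)} e(\nu k/d). \tag{3.7}$$

Lemma 3.3. For any complex numbers $\xi(k, \ell)$ we have

$$\sum_{d \le D} \Big|\mathop{\sum\sum}_{\substack{0 < k \le K \\ 0 < \ell \le L}} \xi(k, \ell)\rho(k, \ell; d)\Big| \ll (D + \sqrt{DKL})(DKL)^{\varepsilon}\|\xi\|$$

where $\|\xi\|$ denotes the $\ell_2$-norm of $\xi = (\xi(k, \ell))$, that is

$$\|\xi\|^2 = \mathop{\sum\sum}_{\substack{0 < k \le K \\ 0 < \ell \le L}} |\xi(k, \ell)|^2,$$

and the implied constant depends only on $\varepsilon$.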
The functions $\rho(k, \ell; d)$ serve as “Weyl harmonics” for the equidistribution
of roots of the congruence

$$\nu^2 + \ell^2 \equiv 0 \pmod d. \tag{3.8}$$

Note that $\rho(0, \ell; d) = \rho(\ell; d)$ is the multiplicative function which appears in
the expected main term $M_d(x)$ and this is expressed simply in terms of $\rho(d)$
by (3.2). If $k \ne 0$ then $\rho(k, \ell; d)$ is more involved but one can at least reduce
this to the case $\ell = 1$. Specifically, letting $(d, \ell^2) = \gamma\delta^2$ with $\gamma$ squarefree so
$d = \gamma\delta^2 d'$, $\ell = \gamma\delta\ell'$, one shows that

$$\rho(k, \ell; d) = \delta\rho(k'\ell', 1; d') \tag{3.9}$$

provided that $k = \delta k'$ is a multiple of $\delta$, while $\rho(k, \ell; d)$ vanishes if $k$ is not
divisible by $\delta$. By this we obtain

$$\sum_{d \le D} \Big|\mathop{\sum\sum}_{\substack{0 < k \le K \\ 0 < \ell \le L}} \xi(k, \ell)\rho(k, \ell; d)\Big| \le \sum_{\gamma\delta^2 d \le D} \delta\,\Big|\mathop{\sum\sum}_{\substack{0 < k \le K/\delta \\ 0 < \ell \le L/\gamma\delta \\ (\ell, d) = 1}} \xi(\delta k, \gamma\delta\ell)\rho(k\ell, 1; d)\Big|.$$

Ignoring the condition $(\ell, d) = 1$ we would get the bound of Lemma 3.3 by
applying (3.6) directly. However this co-primality condition can be inserted at
no extra cost by Möbius inversion and this completes the proof of Lemma 3.3.

Now we are ready to prove Lemma 3.1. We begin by smoothing the sum
$A_d(x)$ with a function $f(u)$ supported on $[0, x]$ such that

$$f(u) = 1 \quad \text{if } 0 < u \le x - y,$$

$$f^{(j)}(u) \ll y^{-j} \quad \text{if } x - y < u < x,$$

where $y$ will be chosen later subject to $x^{1/2} < y < x$ and the implied constant
depends only on $j$. Our intention is to apply Fourier analysis to the sum

$$A_d(f) = \sum_{n \equiv 0 \ (\mathrm{mod}\ d)} a_n f(n)$$

rather than directly to $A_d(x)$. By a trivial estimation the difference is

$$\sum_{d \le D} |A_d(x) - A_d(f)| \ll yx^{-1/4+\varepsilon}. \tag{3.10}$$

In $A_d(f)$ we split the summation over $a$ into classes modulo $d$ getting

$$A_d(f) = \sum_{b} \mathfrak{z}(b) \sum_{\alpha^2+b^2 \equiv 0 \ (d)}\ \sum_{a \equiv \alpha \ (d)} f(a^2 + b^2).$$
It is convenient to first remove the contribution coming from terms with $b = 0$,
since these are not covered by Lemma 3.3. This contribution is

$$\mathfrak{z}(0)\sum_{a^2 \equiv 0 \ (d)} f(a^2) = \mathfrak{z}(0)\sum_{a} f\big((a d_1 d_2)^2\big) \ll \frac{\sqrt{x}}{d_1 d_2}.$$

For the nonzero values of $b$ we expand the above inner sum into Fourier series
by Poisson’s formula getting

$$\sum_{a \equiv \alpha \ (d)} f(a^2 + b^2) = \frac{1}{d}\sum_{k} e\Big(\frac{\alpha k}{d}\Big)\int_{-\infty}^{\infty} f(t^2 + b^2)\,e\Big(\frac{-tk}{d}\Big)\,dt.$$

Hence the smooth sum $A_d(f)$ has the expansion

$$A_d(f) = \frac{2}{d}\sum_{b \ne 0} \mathfrak{z}(b) \sum_{k} \rho(k, b; d)I(k, b; d) + O\Big(\frac{\sqrt{x}}{d_1 d_2}\Big) \tag{3.11}$$

where $I(k, b; d)$ is the Fourier integral

$$I(k, b; d) = \int_0^{\infty} f(t^2 + b^2)\cos(2\pi tk/d)\,dt.$$

The main term comes from $k = 0$ which gives

$$M_d(f) = \frac{2}{d}\sum_{b} \mathfrak{z}(b)\rho(b; d)I(0, b; d).$$

Since in this case the integral approximates to the sum, precisely

$$2I(0, b; d) = \sum_{a^2+b^2 \le x} 1 + O\big(y(x + y - b^2)^{-1/2}\big),$$

the difference between the expected main terms satisfies

$$M_d(f) - M_d(x) \ll \frac{y}{d}\sum_{c^4 \le x} \rho(c^2; d)(x + y - c^4)^{-1/2}.$$

Summing over moduli we first derive by the same arguments which led us to
(3.5) that

$$\sum_{d \le D} d^{-1}\rho(c^2; d) \ll (\log 2D)^2,$$

and then summing over $c$ we arrive at

$$\sum_{d \le D} |M_d(f) - M_d(x)| \ll yx^{-1/4}(\log x)^2. \tag{3.12}$$

For positive frequencies $k$ we shall estimate $I(k, b; d) = I(-k, b; d)$ by
repeated partial integration. We have
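$$\frac{\partial^j}{\partial t^j} f(t^2 + b^2) = \sum_{0 \le 2i \le j} c_{ij}\,t^{j-2i} f^{(j-i)}(t^2 + b^2)$$

with some positive constants $c_{ij}$, whence

$$I(k, b; d) \ll \sqrt{x}\Big(\frac{d\sqrt{x}}{ky}\Big)^{j}$$

for any $j \ge 0$. This shows that $I(k, b; d)$ is very small if $k \ge K = Dy^{-1}x^{1/2+\varepsilon}$
by choosing $j = j(\varepsilon)$ sufficiently large. Estimating the tail of the Fourier series
(3.11) trivially we are left with

$$A_d(f) = M_d(f) + \frac{4}{d}\sum_{b \ne 0} \mathfrak{z}(b) \sum_{0 < k \le K} \rho(k, b; d)I(k, b; d) + O\Big(\frac{\sqrt{x}}{d_1 d_2}\Big).$$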



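To separate the modulus $d$ from $k$, $b$ in the Fourier integral we write

$$I(k, b; d) = \frac{\sqrt{x}}{k}\int_0^{k} f\Big(\frac{xt^2}{k^2} + b^2\Big)\cos(2\pi t\sqrt{x}/d)\,dt$$

by changing the variable $t$ into $t\sqrt{x}/k$. Note that the new variable lies in the
range $0 < t < k$. Hence $|A_d(f) - M_d(f)|$ is bounded by

$$\frac{4\sqrt{x}}{d}\int_0^{K} \Big|\mathop{\sum\sum}_{\substack{0 < b \le \sqrt{x} \\ t < k \le K}} \frac{\mathfrak{z}(b)}{k} f\Big(\frac{xt^2}{k^2} + b^2\Big)\rho(k, b; d)\Big|\,dt + O\Big(\frac{\sqrt{x}}{d_1 d_2}\Big).$$

Recall that $\mathfrak{z}(b)$ is supported on squares; $b = c^2$ with $|c| \le C = x^{1/4}$. Applying
Lemma 3.3 to the relevant triple sum and then integrating over $0 < t < K$ we
obtain

$$\sum_{d \le D} d\,|A_d(f) - M_d(f)| \ll \sqrt{x}\,(D + C\sqrt{DK})(CK)^{\frac12+\varepsilon}.$$

Hence the smooth remainder satisfies

$$\sum_{d \le D} |A_d(f) - M_d(f)| \ll D^{1/2}y^{-1}x^{11/8+\varepsilon}. \tag{3.13}$$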


Finally, on combining (3.10), (3.12) and (3.13) we obtain (3.3) by the choice

$$y = D^{1/4}x^{13/16}.$$
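As a quick check on the exponents, one can balance the error terms directly. Assuming the bounds read $yx^{-1/4+\varepsilon}$ in (3.10) and (3.12) and $D^{1/2}y^{-1}x^{11/8+\varepsilon}$ in (3.13), the first is increasing in $y$ and the second is decreasing, so the natural choice equates them:

$$yx^{-1/4} = D^{1/2}y^{-1}x^{11/8} \iff y^2 = D^{1/2}x^{13/8} \iff y = D^{1/4}x^{13/16}.$$

With this choice each term is $D^{1/4}x^{13/16 - 1/4 + \varepsilon} = D^{1/4}x^{9/16+\varepsilon}$, in agreement with (3.3). The constraint $y < x$ then amounts to $D < x^{3/4}$, which is the range already assumed.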

From now on $\mathfrak{z}(b)$ is equal to 2 if $b = c^2 > 0$, $\mathfrak{z}(0) = 1$, and $\mathfrak{z}(b) = 0$
otherwise. In other words

$$\mathfrak{z}(b) = \sum_{c^2 = b} 1 \tag{3.14}$$

where $c$ is any integer. Note that $\mathfrak{z}(b)$ is the Fourier coefficient of the classical
theta function. For this choice of $\mathfrak{z}$ we shall evaluate the main term $M_d(x)$
more precisely.
Lemma 3.4. For $d$ cubefree we have

$$M_d(x) = g(d)\,4\kappa x^{3/4} + O\big(h(d)x^{1/2}\big) \tag{3.15}$$

where $\kappa$ is the constant given by the elliptic integral (1.2) and $g(d)$, $h(d)$ are
the multiplicative functions given by

$$g(p)p = 1 + \chi_4(p)\Big(1 - \frac1p\Big), \qquad g(p^2)p^2 = 1 + \rho(p)\Big(1 - \frac1p\Big), \tag{3.16}$$

$$h(p)p = 1 + 2\rho(p), \qquad h(p^2)p^2 = p + 2\rho(p),$$

except that $g(4) = \frac14$.

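Proof. We have

$$M_d(x) = \frac{2}{d}\sum_{|c| \le x^{1/4}} \rho(c^2; d)\big\{(x - c^4)^{1/2} + O(1)\big\}.$$

Since $d$ is cubefree we can write $d = d_1 d_2^2$ with $d_1 d_2$ squarefree, so that we have
$\rho(\ell^2; d) = (\ell, d_2)\rho\big(d_1 d_2/(\ell, d_1 d_2)\big)$ except for $d_2$ even and $\ell$ odd, in which case
$\rho(\ell^2; d) = 0$. Hence, for $d$ not divisible by 4 we have

$$M_d(x) = \frac{2}{d}\sum_{\substack{\nu_1 \mid d_1 \\ \nu_2 \mid d_2}} \nu_2\,\rho\Big(\frac{d_1 d_2}{\nu_1 \nu_2}\Big) \sum_{\substack{|c| \le x^{1/4} \\ (c, d_1 d_2) = \nu_1 \nu_2}} \big\{(x - c^4)^{1/2} + O(1)\big\}$$

$$\qquad = \frac{2}{d}\sum_{\substack{\nu_1 \mid d_1 \\ \nu_2 \mid d_2}} \nu_2\,\rho\Big(\frac{d_1 d_2}{\nu_1 \nu_2}\Big) \Big\{\varphi\Big(\frac{d_1 d_2}{\nu_1 \nu_2}\Big)\frac{2\kappa x^{3/4}}{d_1 d_2} + O\Big(\tau\Big(\frac{d_1 d_2}{\nu_1 \nu_2}\Big)x^{1/2}\Big)\Big\}.$$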



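This formula gives (3.15) with

$$g(d) = \frac{1}{d}\sum_{\substack{\nu_1 \mid d_1 \\ \nu_2 \mid d_2}} \frac{\nu_2}{d_1 d_2}\,\rho\Big(\frac{d_1 d_2}{\nu_1 \nu_2}\Big)\varphi\Big(\frac{d_1 d_2}{\nu_1 \nu_2}\Big), \qquad h(d) = \frac{1}{d}\sum_{\substack{\nu_1 \mid d_1 \\ \nu_2 \mid d_2}} \nu_2\,\rho\Big(\frac{d_1 d_2}{\nu_1 \nu_2}\Big)\tau\Big(\frac{d_1 d_2}{\nu_1 \nu_2}\Big),$$

which completes the proof of Lemma 3.4 in this case. For $d$ cubefree and
divisible by 4 the above argument goes through except that, as noted, $\rho(\ell^2; d)
= 0$ for $\ell$ odd. This implies that, in the summation, $c$ and hence $\nu_2$ must be
restricted to even numbers. This makes the value of $g(4)$ exceptional. □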
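Since $g(d)$ is the density of pairs $(a, c)$ modulo $d$ with $a^2 + c^4 \equiv 0 \pmod d$, the values in (3.16) can be compared with a direct count. The sketch below is an illustration under that interpretation, namely that the count equals $d^2 g(d)$; the function names are ours.

```python
def solution_count(d):
    # Number of pairs (a, c) modulo d with a^2 + c^4 = 0 (mod d)
    return sum(1 for a in range(d) for c in range(d) if (a * a + c ** 4) % d == 0)

def chi4(n):
    # The non-principal character modulo 4
    return (0, 1, 0, -1)[n % 4]

def predicted_count_prime(p):
    # p^2 * g(p), using g(p) p = 1 + chi_4(p) (1 - 1/p)
    return p + chi4(p) * (p - 1)

def predicted_count_prime_square(p):
    # p^4 * g(p^2), using g(p^2) p^2 = 1 + rho(p) (1 - 1/p), with g(4) = 1/4
    if p == 2:
        return 4
    rho_p = 1 + chi4(p)
    return p * p + rho_p * (p * p - p)
```

For $p = 5$ this predicts $9$ solutions modulo $5$ and $65$ solutions modulo $25$.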


We define the error term

$$r_d(x) = A_d(x) - g(d)A(x). \tag{3.17}$$

By Lemma 3.4 for $d = 1$ we get

$$A(x) = 4\kappa x^{3/4} + O\big(x^{1/2}\big); \tag{3.18}$$

thus for $d$ cubefree the error term satisfies

$$r_d(x) = A_d(x) - M_d(x) + O\big(h(d)x^{1/2}\big).$$

Note that

$$\sum^{3}_{d \le x} h(d) \le \prod_{p \le x} (1 + h(p))(1 + h(p^2)) \ll (\log x)^4,$$

where the superscript 3 restricts to cubefree numbers. This together with
Lemma 3.1 implies

Proposition 3.5. We have for all $t \le x$,

$$\sum^{3}_{d \le D} |r_d(t)| \ll D^{1/4}x^{9/16+\varepsilon}. \tag{3.19}$$

The restriction to cubefree moduli in (3.19) is not necessary but it is
sufficient for our needs. The fact that we are able to make this restriction will
be technically convenient in a number of places specifically because cubefree
numbers $d$ possess the property that they can be decomposed as $d = d_1 d_2^2$ with
$d_1$, $d_2$ squarefree and $(d_1, d_2) = 1$.

Proposition 3.5 verifies one of the two major hypotheses of the ASP
(Asymptotic Sieve for Primes), namely (2.9) with $D = x^{3/4 - 20\varepsilon}$ by a
comfortable margin and indeed is, apart from the $\varepsilon$, the best that one can hope for.
The hypotheses (2.4), (2.5), and (2.6) are easily verified by an examination of 
(3.16). The asymptotic formula (2.7) is derived from the Prime Number The¬ 
orem for the primes in residue classes modulo four. Next, the crude bounds 

(2.1) , (2.2) and (2.8) are obvious in our case. More precisely, one can derive 
by elementary arguments that A d (x) <C d~ 1 r(d)A(x) uniformly for d < x^~ £ 
in place of (2.8). Therefore we are left with the problem of establishing the 
second major hypothesis of the ASP, namely the bilinear form bound (2.11). 

4. The bilinear form in the sieve: Refinements 

Throughout $a_n$ denotes the number of integral solutions $a, c$ to

(4.1) $a^2 + c^4 = n$.

Recall from the previous section that (see (3.18))

(4.2) $A(x) = \sum_{n \le x} a_n = 4\kappa x^{3/4} + O(x^{1/2})$.
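As a numerical sanity check on (4.2), the following Python sketch counts the integer pairs $(a, c)$ with $a^2 + c^4 \le x$ and compares the count with $4\kappa x^{3/4}$, using the closed form for $\kappa$ quoted in (1.2). The function names are hypothetical and chosen only for this illustration; the count includes the pair $(0, 0)$, which changes the total by 1.

```python
from math import gamma, isqrt, pi, sqrt

# kappa = Gamma(1/4)^2 / (6 * sqrt(2*pi)), the closed form quoted in (1.2)
KAPPA = gamma(0.25) ** 2 / (6 * sqrt(2 * pi))


def count_representations(x: int) -> int:
    """Number of integer pairs (a, c) with a^2 + c^4 <= x (the pair (0, 0) included)."""
    total = 0
    c = 0
    while c ** 4 <= x:
        a_max = isqrt(x - c ** 4)      # largest |a| allowed for this c
        pairs_for_c = 2 * a_max + 1    # a runs over -a_max, ..., a_max
        # c = 0 is counted once, every other c together with -c
        total += pairs_for_c if c == 0 else 2 * pairs_for_c
        c += 1
    return total


def main_term(x: float) -> float:
    """Main term 4 * kappa * x^(3/4) of (4.2)."""
    return 4 * KAPPA * x ** 0.75


if __name__ == "__main__":
    for x in (10 ** 4, 10 ** 6, 10 ** 8):
        print(x, count_representations(x), round(main_term(x)))
```

The ratio of the two printed columns should approach 1 as $x$ grows, in line with the error term $O(x^{1/2})$.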
In this section we give a preliminary analysis of the bilinear forms

(4.3) $B(x;N) = \sum_{m} \Bigl|\sum_{\substack{N < n \le 2N \\ mn \le x \\ (n,\, m\Pi) = 1}} \beta(n)\, a_{mn}\Bigr|$

with coefficients $\beta(n)$ given by (2.12) and $\Pi$ the product of primes $p < P$ with $P$ in the range

(4.4) $(\log\log x)^2 \le \log P \le (\log x)(\log\log x)^{-2}$.

Although the sieve does not require any lower bound for $P$, that is $\Pi = 1$ is
permissible, we introduce this as a technical device which greatly simplifies a 
large number of computations. With slightly more work we could relax the 
lower bound for P to a suitably large power of log x and still obtain the same 
results. 

Note the bound

$$B(x;N) \ll A(x)(\log x)^4$$

uniformly in $N \le x^{1/2}$. This follows from (3.1) by a trivial estimation, but we need the stronger bound (2.11). We shall establish the following improvement:

Proposition 4.1. Let $\eta > 0$ and $A > 0$. Then we have

(4.5) $B(x;N) \ll A(x)(\log x)^{4-A}$

for every $N$ with

(4.6) $x^{1/4+\eta} < N < x^{1/2}(\log x)^{-B}$

and the coefficients $\beta(n, C)$ given by (2.12) with $1 \le C \le N^{1-\eta}$. Here $B$ and the implied constant in (4.5) need to be taken sufficiently large in terms of $\eta$ and $A$.

By virtue of the results presented in the previous sections Proposition 4.1 is more than sufficient to infer the formula

(4.7) $\sum_{p \le x} a_p \log p = H A(x)\Bigl\{1 + O\Bigl(\frac{\log\log x}{\log x}\Bigr)\Bigr\}$

(it suffices to have (4.5) with $A = 2^{26} + 4$ and $x^{3/8-\eta} < N < x^{1/2}(\log x)^{-B}$ for some $\eta > 0$ and $B > 0$). In this formula $H$ is given by (2.17) with $g(p)p = 1 + \chi_4(p)\bigl(1 - \frac{1}{p}\bigr)$, whence

(4.8) $H = \prod_{p} \bigl(1 - \chi_4(p)p^{-1}\bigr) = L(1, \chi_4)^{-1} = \frac{4}{\pi}$.
Therefore (4.7), (4.8) and (4.2) yield the asymptotic formula (1.1) of our main theorem. Note that in the formulation of Theorem 1 we restricted to representations by positive integers thus obtaining a constant equal to one fourth of that in (4.7).
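The value $4/\pi$ in (4.8) can be checked numerically. The sketch below multiplies the local factors $1 - \chi_4(p)/p$ over the primes up to a cutoff; since the product converges only conditionally, the agreement at a finite cutoff is approximate. The helper names and the cutoff are assumptions made for this illustration.

```python
from math import pi


def primes_up_to(n: int) -> list:
    """Sieve of Eratosthenes returning all primes <= n (n >= 1)."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            # strike out the multiples of i starting from i*i
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(n + 1) if sieve[i]]


def chi4(p: int) -> int:
    """Non-principal character modulo 4."""
    if p % 2 == 0:
        return 0
    return 1 if p % 4 == 1 else -1


def partial_H(limit: int) -> float:
    """Partial Euler product of (1 - chi4(p)/p) over primes p <= limit."""
    prod = 1.0
    for p in primes_up_to(limit):
        prod *= 1.0 - chi4(p) / p
    return prod


if __name__ == "__main__":
    print(partial_H(200000), 4 / pi)
```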

It remains to prove Proposition 4.1, and this is the heart of the problem. 
In this section we make a few technical refinements of the bilinear form B(x; N) 
which will be useful in the sequel. 

First of all the coefficients $\beta(n)$ can be quite large which causes a problem in Section 9. More precisely we have $|\beta(n)| \le \tau(n)$ so the problem occurs for a few $n$ for which $\tau(n)$ is exceptionally large. We remove these terms now because it will be more difficult to control them later. Let $B'(x;N)$ denote the partial sum of $B(x;N)$ restricted by

(4.9) $\tau(n) \le \tau$

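where $\tau$ will be chosen as a large power of $\log x$. The complementary sum is estimated trivially by

$$\mathop{\sum\sum}_{\substack{mn \le x \\ \tau(n) > \tau}} \mu^2(mn)\tau(n)a_{mn} \le \tau^{-1} \mathop{\sum\sum}_{mn \le x} \mu^2(mn)\tau(n)^2 a_{mn} \le \tau^{-1} \sum_{n \le x} \tau_5(n)a_n.$$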
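The last step above rests on the elementary identity $\sum_{n \mid k} \tau(n)^2 = \tau_5(k)$ for squarefree $k$ (both sides equal $5^{\omega(k)}$). The following brute-force sketch checks this for small $k$; it is only an illustration, with hypothetical helper names.

```python
def divisors(n: int) -> list:
    """All positive divisors of n."""
    return [d for d in range(1, n + 1) if n % d == 0]


def tau(n: int) -> int:
    """Number of divisors of n."""
    return len(divisors(n))


def tau_k(n: int, k: int) -> int:
    """Number of ordered factorizations n = d_1 * ... * d_k."""
    if k == 1:
        return 1
    return sum(tau_k(n // d, k - 1) for d in divisors(n))


def is_squarefree(n: int) -> bool:
    return all(n % (p * p) for p in range(2, int(n ** 0.5) + 1))


def sum_tau_squared(n: int) -> int:
    """Sum of tau(d)^2 over the divisors d of n."""
    return sum(tau(d) ** 2 for d in divisors(n))


if __name__ == "__main__":
    for k in (6, 30, 42):
        print(k, sum_tau_squared(k), tau_k(k, 5))
```

For $k$ that is not squarefree the two sides can differ (for instance $k = 4$), which is why the factor $\mu^2(mn)$ matters here.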

By Lemma 2.2 we have $\tau_5(n) \le \tau(n)^{\log 5/\log 2} \le (2\tau(d))^7$ for some $d \mid n$ with $d \le n^{1/3}$. Hence the above sum is bounded by

$$\sum_{d \le x^{1/3}} (2\tau(d))^7 A_d(x) \ll A(x) \sum_{d \le x^{1/3}} \tau(d)^7 g(d) \ll A(x)(\log x)^{2^7},$$

which gives

(4.10) $B(x;N) = B'(x;N) + O\bigl(\tau^{-1}A(x)(\log x)^{128}\bigr)$.

To make this bound admissible for (4.5) we assume that

(4.11) $\tau \ge (\log x)^{A+124}$.

While the restriction (4.9) will help us to estimate the error term in the lattice point problem it is not desired for the main term because the property $\tau(n) \le \tau$ is not multiplicatively stable (to the contrary of $(n, \Pi) = 1$). In the resulting main term in Section 10 we shall remove the restriction (4.9) by the same method which allowed us to install it here.

In numerous transformations of $B(x;N)$ we shall be faced with technical problems such as separation of variables or handling abnormal structures. When resolving these problems we wish to preserve the nature of the coefficients $\beta(n)$ (think of $\beta(n)$ as being the Möbius function). Thus we should avoid any technique which uses long integration because it corrupts $\beta(n)$.

To get hold of the forthcoming problems we reduce the range of the inner sum of $B'(x;N)$ to short segments of the type

(4.12) $N' < n \le (1 + \theta)N'$

where $\theta^{-1}$ will be a large power of $\log N$, and we replace the restriction $mn \le x$ by $mN \le x$. This reduction can be accomplished by splitting into at most $\theta^{-1}$ such sums and estimating the residual contribution trivially. In fact we get a better splitting by means of a smooth partition of unity. This amounts to changing $\beta(n)$ into

(4.13) $\beta(n) = p(n)\mu(n) \sum_{c \mid n,\ c \le C} \mu(c)$

where $p$ is a smooth function supported on the segment (4.12) for some $N'$ which satisfies $N < N' < 2N$. It will be sufficient that $p$ be twice differentiable with

(4.14) $p^{(j)} \ll (\theta N)^{-j}, \quad j = 0, 1, 2$.

One needs at most $2\theta^{-1}$ such partition functions to cover the whole interval $N < n < 2N$ with multiplicity one except for the points $n$ with $|mn - x| < \theta x$, $|n - N| < \theta N$ or $|n - 2N| < \theta N$. However, these boundary points contribute at most $O(\theta A(x)(\log x)^4)$ by a straightforward estimation so we have

$$B'(x;N) = \sum_{p} B'_p(x;N) + O\bigl(\theta A(x)(\log x)^4\bigr)$$

where $p$ ranges over the relevant partition functions and $B'_p(x;N)$ is the corresponding smoothed form of $B'(x;N)$. To make the above bound for the residual contribution admissible for (4.5) we assume that

(4.15) $\theta = (\log x)^{-A'}$

with $A' \ge A$. We do not specialize $A'$ for the time being, in fact not until Section 18, but it will be much larger than $A$. In other words $\theta$ is quite a bit smaller than the factor

(4.16) $\vartheta = (\log x)^{-A}$,

which we aim to save in the bound (4.5). Since the number of smoothed forms does not exceed $2\theta^{-1}$ it suffices to show that each of these satisfies

(4.17) $B'_p(x;N) \ll \vartheta\theta A(x)(\log x)^4$.

Next we split the outer summation into dyadic segments

(4.18) $M < m \le 2M$;

to save space we shall sometimes write this as $m \sim M$. Since the contribution of terms with $m \le \vartheta x N^{-1}$ is absorbed by the bound (4.17) we are left with

(4.19) $\vartheta x < MN < x$.

Note that (4.6) and (4.19) imply $M \ge N$ since we may require $B \ge 2A$.

After the splitting we obtain bilinear forms of type

(4.20) $B^*(M,N) = \mathop{\sum\sum}_{(m,n)=1} \alpha(m)\beta(n)a_{mn}$,

where we allow $\alpha(m)$ to be any complex numbers supported on (4.18) with $|\alpha(m)| \le 1$, while $\beta(n)$ are given by (4.13). Here and hereafter, in order to save frequent writing of the summation conditions

(4.21) $(n, \Pi) = 1$

(4.22) $\tau(n) \le \tau$

we rather regard these as restrictions on the support of $\beta(n)$. Occasionally it will be appropriate to remind the reader of this convention. It now suffices to show that for every $N$ satisfying (4.6) and $MN$ satisfying (4.19) we have

(4.23) $B^*(M,N) \ll \vartheta\theta(MN)^{3/4}(\log MN)^4$.


5. The bilinear form in the sieve: Transformations 

Typically for general bilinear forms one applies Cauchy's inequality in order to smooth and then to execute the outer summation. However in the case of our special form $B^*(M,N)$ the application of Cauchy's inequality at this stage would be premature. This is due to the multiplicity of representations

(5.1) $a_{mn} = \sum_{a^2+c^4=mn} 1$

where $a, c$ run over all integers. This multiplicity is locked into the inner sum and we do not wish to amplify it by squaring since that would have fatal effects on the harmonic analysis when the time comes to count the lattice points. Therefore, we release that part of the multiplicity which is accommodated in the outer variable and we smooth out this part rather than amplify by applying Cauchy's inequality. In order to be able to extract this hidden part of the multiplicity we shall write the solutions $a^2 + b^2 = mn$ in $a, b \in \mathbb{Z}$ in terms of Gaussian integers $w, z \in \mathbb{Z}[i]$. Not only is our entry to the Gaussian domain necessary but it will also clarify the arguments. The arithmetic we are going to apply lies truly in the field $\mathbb{Q}(i)$. On the other hand, some of our arguments such as the quadratic reciprocity law, seem more familiar when performed in $\mathbb{Q}$. Thus we make no appeal to properties of the biquadratic residue symbol, instead we work with the Dirichlet symbol (see (19.11)) which is an extension of that of Jacobi to $\mathbb{Z}[i]$.

Since $(m, n) = 1$ we have by the unique factorization in $\mathbb{Z}[i]$

(5.2) $a_{mn} = \frac{1}{4} \sum_{|w|^2 = m} \sum_{|z|^2 = n} \mathfrak{z}(\operatorname{Re} \bar{w}z)$

where $\frac{1}{4}$ accounts for the four units $1, i, i^2, i^3$ in $\mathbb{Z}[i]$. Since $n$ is odd so is $z$. Multiplying $w$ and $z$ by a unit one can rotate $z$ to a number satisfying

(5.3) $z \equiv 1 \pmod{2(1 + i)}$.

Such a number is called primary; it is determined uniquely by its ideal. In terms of coordinates $z = r + is$ the congruence (5.3) means

(5.4) $r \equiv 1 \pmod 2$

and

(5.5) $s \equiv r - 1 \pmod 4$

so that $r$ is odd and $s$ is even.
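The equivalence of (5.3) with the coordinate conditions (5.4), (5.5), and the fact that every odd Gaussian integer has exactly one primary unit multiple, can be checked by brute force. The sketch below represents $z = r + is$ as a pair of integers; the function names are hypothetical.

```python
def is_primary(r: int, s: int) -> bool:
    """Coordinate form (5.4), (5.5): r odd and s = r - 1 (mod 4)."""
    return r % 2 == 1 and (s - (r - 1)) % 4 == 0


def divisible_by_2_1_plus_i(r: int, s: int) -> bool:
    """True if r + is is divisible by 2(1 + i) in Z[i].

    (r + is)/(2 + 2i) = ((r + s) + i(s - r))/4, so both r + s and s - r
    must be multiples of 4.
    """
    return (r + s) % 4 == 0 and (s - r) % 4 == 0


def unit_multiples(r: int, s: int) -> list:
    """The four associates z, iz, -z, -iz of z = r + is."""
    return [(r, s), (-s, r), (-r, -s), (s, -r)]


if __name__ == "__main__":
    z = (1, 2)  # 1 + 2i, of norm 5
    print([w for w in unit_multiples(*z) if is_primary(*w)])
```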

By (5.2) we can express the bilinear form (4.20) as

(5.6) $B^*(M,N) = \mathop{\sum\sum}_{(w\bar{w},\, z\bar{z}) = 1} \alpha_w \beta_z\, \mathfrak{z}(\operatorname{Re} \bar{w}z)$

where $\alpha_w = \alpha(|w|^2)$ and $\beta_z = \beta(|z|^2)$. Here we assume that $z$ runs over primary numbers so the multiplicity four does not occur in (5.6). In the sequel we regard $\beta_z$ as a function supported on numbers having a fixed residue class modulo eight, say

(5.7) $z \equiv z_0 \pmod 8$

where $z_0$ is primary. This can be accomplished by splitting $B^*(M,N)$ into eight such classes. Recall that we also have the restrictions for the support of $\beta_z$ coming from (4.21), (4.22). These read as

(5.8) $(z, \Pi) = 1$

(5.9) $\tau(|z|^2) \le \tau$.

To obtain the factorization (5.2) it was very convenient to have the condition $(m, n) = 1$. This condition, which meanwhile has become $(w\bar{w}, z\bar{z}) = 1$ in (5.6), is now a hindrance since we want $w$ to run freely. We shall remove the condition $(w\bar{w}, z\bar{z}) = 1$ by estimating trivially the complementary form

$$B(M,N) = \mathop{\sum\sum}_{(w\bar{w},\, z\bar{z}) \ne 1} \alpha_w \beta_z\, \mathfrak{z}(\operatorname{Re} \bar{w}z).$$

To this end we take advantage of (5.8) getting

$$B(M,N) \ll \sum_{p \ge P}\ \mathop{\sum\sum}_{\substack{M < u^2+v^2 \le 2M \\ u^2+v^2 \equiv 0\,(p)}}\ \mathop{\sum\sum}_{\substack{N < r^2+s^2 \le 2N \\ r^2+s^2 \equiv 0\,(p) \\ ur+vs = \square}} |\beta(r^2+s^2)|.$$

Since $\beta(n)$ is supported on odd squarefree numbers we have $(r, s) = 1$. Note that $p \nmid rs$. Put $ur + vs = c^2$ with $|c| \le 2(MN)^{1/4}$. Given $c, r, s$ the residue class of $u \pmod{ps/(c,p)}$ is fixed and then $v$ is determined. Therefore the number of points $w = u + iv$ is bounded by $O(1 + \sqrt{M}(c,p)/ps)$. By symmetry we can replace this by $O(1 + \sqrt{M}(c,p)/pr)$ and hence by $O(1 + \sqrt{M}(c,p)/p\sqrt{N})$. Summing over $c$ and noting that $P \ll (MN)^{1/4}$, we find that our complementary form satisfies

B(M,N) < (MAT) 3 (l + VM/PVN^J E E r (")i/ J (")i 

p N<n^2N 
n=0(modp) 

< (MAT) i (l + VM/PVN) N(\og N ) 2 . 


Hence, adding $B(M,N)$ to $B^*(M,N)$ we conclude that

(5.10) $B^*(M,N) = \mathcal{B}(M,N) + O\bigl((M^{1/4}N^{5/4} + P^{-1}M^{3/4}N^{3/4})(\log N)^3\bigr)$

where $\mathcal{B}(M,N)$ is the free bilinear form

$$\mathcal{B}(M,N) = \sum_{w} \sum_{z} \alpha_w \beta_z\, \mathfrak{z}(\operatorname{Re} \bar{w}z).$$

Note that the error term in (5.10) is admissible for (4.23) if $N \le \vartheta\theta\sqrt{MN}$ and $\vartheta\theta P \ge 1$. The first condition is satisfied if

(5.11) $B \ge \frac{3}{2}A + A'$,

by virtue of (4.19) and (4.6). The second condition requires $P \ge (\log x)^{A+A'}$. Actually in (5.24) below we shall impose a stronger condition for $P$.

Next we are going to assume that $\beta_z$ is supported in a narrow sector

(5.12) $\varphi < \arg z < \varphi + 2\pi\theta$

for some $-\pi < \varphi < \pi$ where $\theta$ is the same as in (4.12). This can be accomplished by splitting according to a smooth partition of unity (without any residual contribution because there is no boundary). We need only a $C^2$-class partition. In other words we attach to $\beta_z$ a periodic function $q(\alpha)$ of period $2\pi$ supported on $\varphi < \alpha < \varphi + 2\pi\theta$ such that

$$q^{(j)} \ll \theta^{-j}, \quad j = 0, 1, 2.$$

Thus from now on

(5.13) $\beta_z = q(\alpha)p(n)\mu(n) \sum_{c \mid n,\ c \le C} \mu(c)$

where $\alpha = \arg z$ and $n = |z|^2$. Note that $\beta_z$ is supported on numbers $z = r + is$ with $|z|^2 = r^2 + s^2$ squarefree, which implies $(r, s) = 1$ so $z$ is primitive. Since $z$ is also odd we have $(z, \bar{z}) = 1$ and this property will prove to be convenient in several places.

The intersection of the annulus (4.12) with the sector (5.12) is a polar box

(5.14) $\mathfrak{B} = \{z : N' < |z|^2 \le (1 + \theta)N',\ \varphi < \arg z < \varphi + 2\pi\theta\}$

of volume

$$\operatorname{vol} \mathfrak{B} = \pi\theta^2 N' \asymp \pi\theta^2 N.$$

Since the number of polar boxes is $O(\theta^{-2})$ we now need to prove that for the bilinear form $\mathcal{B}(M,N)$ restricted smoothly to a box we have

(5.15) $\mathcal{B}(M,N) \ll \vartheta\theta^2(MN)^{3/4}(\log MN)^4$

whereas the trivial bound is

(5.16) $\mathcal{B}(M,N) \ll \theta^2(MN)^{3/4}(\log N)^2$.

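Indeed, arguing along the same lines as for (5.10) we have

$$\mathcal{B}(M,N) \ll \mathop{\sum\sum}_{M < u^2+v^2 \le 2M}\ \sum_{\substack{r+is \in \mathfrak{B} \\ ur+vs = \square}} |\beta(r^2+s^2)| \ll M^{3/4}N^{-1/4} \sum_{r+is \in \mathfrak{B}} |\beta(r^2+s^2)|.$$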

By Lemma 2.2 

\/3(r 2 + s 2 )\ <r(r 2 + s 2 )< 9 E T ( d ) ■ 

d\(r 2 +s 2 ) 

Given such a $d$, we have

$$\#\{r + is \in \mathfrak{B} : r^2 + s^2 \equiv 0\,(d)\} \ll \theta^2 N\rho(d)d^{-1}.$$

Moreover we have

$$\sum_{d} \rho(d)\tau(d)d^{-1} \ll (\log N)^2.$$

These estimates yield the bound (5.16). With more work one could replace $(\log N)^2$ by $\log N$ but we seek the saving of a factor $\vartheta^{-1}$ which is an arbitrary power of $\log N$.

We can assume that

$$|\varphi \pmod{\pi/2}| \ge \pi\vartheta$$

because the other sectors, altogether of angle $< 2\pi\vartheta$, contribute no more than the bound (4.23) by the estimate (5.16). For $z = r + is$ in any remaining polar box we are not near either axis, and hence

$$\vartheta\sqrt{N} < |r|, |s| < 2\sqrt{N}.$$

Be aware also that $r, s$ have fixed signs depending only on $\varphi$.

Recall that $\alpha_w$ are bounded numbers with $|w|^2$ in the dyadic segment (4.18). By Cauchy's inequality

(5.17) $\mathcal{B}^2(M,N) \ll M\,\mathcal{D}(M,N)$

where

(5.18) $\mathcal{D}(M,N) = \sum_{w} f(w) \Bigl|\sum_{z} \beta_z\, \mathfrak{z}(\operatorname{Re} \bar{w}z)\Bigr|^2$.

Here we have introduced a smooth majorant $f$ to simplify the forthcoming harmonic analysis. We choose an $f$ that is supported in the annulus

(5.19) $\frac{1}{2}\sqrt{M} \le |w| \le 2\sqrt{M}$.

Also, it is convenient to take $f$ to be radial, that is $f(w) = f(|w|)$. Now we need to prove that

(5.20) $\mathcal{D}(M,N) \ll \vartheta^2\theta^4 M^{1/2}N^{3/2}(\log MN)^8$.

Squaring out we get

(5.21) $\mathcal{D}(M,N) = \sum_{w} f(w) \sum_{z_1} \sum_{z_2} \beta_{z_1}\beta_{z_2}\, \mathfrak{z}(\operatorname{Re} \bar{w}z_1)\, \mathfrak{z}(\operatorname{Re} \bar{w}z_2)$.

Here we want to insert the condition $(z_1, z_2) = 1$ which will give us a sum that is easier to work with. Since $z_1 z_2$ is coprime with $\Pi$ it will turn out, as we next show, that we can do this at a small cost.

First we require a trivial bound for $\mathcal{D}(M,N)$. To begin note that we have $|\mathcal{D}(M,N)| \le \tau^2 D(M,N)$ where

$$D(M,N) = \sum_{|w|^2 \sim M}\ \mathop{{\sum}^{*}{\sum}^{*}}_{|z_1|^2,\, |z_2|^2 \sim N} \mathfrak{z}(\operatorname{Re} \bar{w}z_1)\, \mathfrak{z}(\operatorname{Re} \bar{w}z_2).$$

Here the $*$ indicates summation over primitive $z$.

Lemma 5.1. For every $M \ge N \ge 2$,

$$D(M,N) \ll \bigl((MN)^{3/4} + M^{1/2}N^{3/2}\bigr)(\log MN)^{514}.$$

Proof. The contribution of the diagonal $|z_1| = |z_2|$ is

$$D_{=}(M,N) \ll \sum_{w} {\sum_{z}}^{*}\, \mathfrak{z}(\operatorname{Re} \bar{w}z) \ll (MN)^{3/4} \log MN$$

by the argument that gave (5.16). The remaining terms of $D(M,N)$, those off the diagonal, have

$$\Delta(z_1, z_2) = \frac{1}{2i}(\bar{z}_1 z_2 - z_1\bar{z}_2) \ne 0$$

because $(z_1, \bar{z}_1) = (z_2, \bar{z}_2) = 1$ and $|z_1| \ne |z_2|$. These contribute

D^M,N)« Y.T. E’E* v 

HU-sIcW i|( W4W )ja) 

That $c_1^2 z_2 - c_2^2 z_1 \ne 0$ follows from (6.3) and (6.4) below.

Using the rectangular coordinates $z_1 = r_1 + is_1$, $z_2 = r_2 + is_2$ we have $\Delta = r_1 s_2 - r_2 s_1$, $c_1^2 r_2 \equiv c_2^2 r_1 \pmod{|\Delta|}$ and $c_1^2 s_2 \equiv c_2^2 s_1 \pmod{|\Delta|}$. By symmetry we can assume $c_1^2 s_2 - c_2^2 s_1 \ne 0$. For given $c_1, c_2, s_1, s_2$, $\Delta \ne 0$, the number $r_1$ is fixed mod $s_1/(s_1, s_2)$ and then $r_2$ is determined. The number of pairs $r_1, r_2$ is bounded by $\sqrt{N}(s_1, s_2)/s_1$. Hence, letting $\delta = (s_1, s_2)$, $s_1 = \delta s_1^*$, $s_2 = \delta s_2^*$ we

get



By Lemma 2.2 with $k = 4$ there exists $d \le (8\sqrt{M}N)^{1/4}$ such that we have $c_1^2 s_2^* \equiv c_2^2 s_1^* \pmod d$ and $\tau(c_1^2 s_2^* - c_2^2 s_1^*) \ll \tau(d)^8$. Hence

r(c?sa - 44 ) < T (d) 8 XX 1 

c\s* 2 + c \s\ d<(MN)Z c l s 2= c 2 s l (mod d) 

« (MAT)5 ^ T(d) 9 d _1 < (MAI)5(logMAT) 29 . 

d<(MN)1 

Summing over $s_1^*, s_2^*$ then $\delta$ we conclude that

$$D_{\ne}(M,N) \ll M^{1/2}N^{3/2}(\log MN)^{2+2^9}.$$

Combining this with the estimate for the diagonal contribution $D_{=}$ we complete the proof of Lemma 5.1. □

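Now we are ready to reduce the sum $\mathcal{D}(M,N)$ to the corresponding sum restricted to $(z_1, z_2) = 1$. We denote the latter by

$$\mathcal{D}^*(M,N) = \sum_{w} f(w) \mathop{\sum\sum}_{(z_1, z_2) = 1} \beta_{z_1}\beta_{z_2}\, \mathfrak{z}(\operatorname{Re} \bar{w}z_1)\, \mathfrak{z}(\operatorname{Re} \bar{w}z_2).$$

We shall prove that the difference $\mathcal{D}(M,N) - \mathcal{D}^*(M,N)$ between these two sums satisfies

$$\mathcal{D}(M,N) - \mathcal{D}^*(M,N) \ll \tau^2\bigl(M^{3/4}N^{3/4} + P^{-1}M^{1/2}N^{3/2}\bigr)(\log MN)^{516}.$$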

Indeed, denoting Gaussian primes by $\pi$ we find that the difference $\mathcal{D}(M,N) - \mathcal{D}^*(M,N)$ is bounded by

^2 ^2 ^2^2 I Z^ttzi /3ttz 2 13 (Re Ie7r ) 3 (Re HJ7r ) 

P<\n\ 2 ^N M<\w\ 2 ^2M JL_<| 2l k 

kr kr 

$$\le \tau^2 D(MP_1, NP_1^{-1})(\log MN)^2$$

for some $P_1$ with $P < P_1 \le N$. The result now follows from Lemma 5.1.

Inserting the above bound for the difference $\mathcal{D}(M,N) - \mathcal{D}^*(M,N)$ we conclude that

(5.22) $\mathcal{D}(M,N) = \mathcal{D}^*(M,N) + O\bigl(\tau^2(M^{3/4}N^{3/4} + P^{-1}M^{1/2}N^{3/2})(\log MN)^{516}\bigr)$.

Observe that the first error term in (5.22) is admissible for (5.20) provided that

(5.23) r < xH 

by virtue of (4.19) and (4.6), and the second is admissible if

(5.24) $P \ge \tau^2(\log x)^{2A+4A'+508}$.

Under these conditions (5.23) and (5.24) it remains to prove that

(5.25) $\mathcal{D}^*(M,N) \ll \vartheta^2\theta^4 M^{1/2}N^{3/2}(\log MN)^8$.

Changing the order of summation we arrange our new sum as

(5.26) $\mathcal{D}^*(M,N) = \mathop{\sum\sum}_{(z_1, z_2) = 1} \beta_{z_1}\beta_{z_2}\, C(z_1, z_2)$

where

(5.27) $C(z_1, z_2) = \sum_{w} f(w)\, \mathfrak{z}(\operatorname{Re} \bar{w}z_1)\, \mathfrak{z}(\operatorname{Re} \bar{w}z_2)$.

The last is a free sum over Gaussian integers $w$. Note that the restrictions on the support of $\beta_z$ which we have imposed so far and the summation condition $(z_1, z_2) = 1$ in (5.26) imply that $(z_1, \bar{z}_1) = (z_2, \bar{z}_2) = 1$ and $z_1 \equiv z_2 \pmod 8$.

6. Counting points inside a biquadratic ellipse 

In this section we evaluate approximately $C(z_1, z_2)$ defined by (5.27). The problem reduces to counting lattice points inside the curve

$$t_1^4 - 2\gamma t_1^2 t_2^2 + t_2^4 = x$$

for fixed $\gamma$ with $0 < \gamma < 1$, namely $\gamma = \cos(\alpha_2 - \alpha_1)$ where $\alpha_j = \arg z_j$. This particular curve arises because of the particular choice (3.14) of $\mathfrak{z}(b)$ as the Fourier coefficient of the classical theta function. The counting requires quite subtle harmonic analysis because the points involved are also constrained by a congruence to a large modulus, so there are very few points relative to the area of the region.

The modulus to which we have referred above is the determinant

(6.1) $\Delta = \Delta(z_1, z_2) = \operatorname{Im} \bar{z}_1 z_2 = \frac{1}{2i}(\bar{z}_1 z_2 - z_1\bar{z}_2) = |z_1 z_2| \sin(\alpha_2 - \alpha_1)$.

Note that $(\Delta, |z_1 z_2|^2) = 1$ because $(z_1, z_2) = 1$ and $(z_1, \bar{z}_1) = (z_2, \bar{z}_2) = 1$. In particular, this co-primality condition implies that $\Delta$ does not vanish. For $z_1, z_2$ in the same polar box (5.14) we have

(6.2) $1 \le |\Delta| \le 4\pi\theta N$.

The variable of summation $w$ in (5.27) which runs over Gaussian integers can be parameterized by two squares of rational integers;

(6.3) $\operatorname{Re} \bar{w}z_1 = c_1^2, \quad \operatorname{Re} \bar{w}z_2 = c_2^2$,

say, with $c_1, c_2 \in \mathbb{Z}$. Indeed these values determine $w$ by

$$i\Delta w = c_1^2 z_2 - c_2^2 z_1.$$

As $w$ ranges freely over $\mathbb{Z}[i]$ the above equation is equivalent to the congruence

(6.4) $c_1^2 z_2 \equiv c_2^2 z_1 \pmod{|\Delta|}$.
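The identity $i\Delta w = c_1^2 z_2 - c_2^2 z_1$ is linear in the two real parts, so it can be tested with arbitrary integer values $A = \operatorname{Re}\bar{w}z_1$, $B = \operatorname{Re}\bar{w}z_2$ in place of the squares. The sketch below, with Gaussian integers stored as integer pairs and hypothetical function names, recovers $w$ from $(A, B)$ and returns `None` exactly when the congruence (6.4) fails.

```python
def re_conj_mul(w, z):
    """Re(conj(w) * z) for w = (u, v), z = (r, s), i.e. u*r + v*s."""
    return w[0] * z[0] + w[1] * z[1]


def delta(z1, z2):
    """Delta(z1, z2) = Im(conj(z1) * z2) = r1*s2 - r2*s1."""
    return z1[0] * z2[1] - z2[0] * z1[1]


def recover_w(A, B, z1, z2):
    """Solve i*Delta*w = A*z2 - B*z1 for a Gaussian integer w, if possible."""
    d = delta(z1, z2)
    x = A * z2[0] - B * z1[0]   # real part of A*z2 - B*z1
    y = A * z2[1] - B * z1[1]   # imaginary part of A*z2 - B*z1
    if d == 0 or x % d or y % d:
        return None             # congruence A*z2 = B*z1 (mod |Delta|) fails
    # (x + iy) / (i*d) = (y - ix) / d
    return (y // d, -x // d)


if __name__ == "__main__":
    z1, z2, w = (3, 2), (1, 2), (2, -1)
    A, B = re_conj_mul(w, z1), re_conj_mul(w, z2)
    print(recover_w(A, B, z1, z2))
```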

Using the rectangular coordinates $z_1 = r_1 + is_1$ and $z_2 = r_2 + is_2$ it appears that (6.4) reads as two rational congruences

$$c_1^2 r_2 \equiv c_2^2 r_1 \pmod{|\Delta|}$$
$$c_1^2 s_2 \equiv c_2^2 s_1 \pmod{|\Delta|}$$

with 
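Next we split the summation into residue classes modulo $|\Delta|$, say

$$C(z_1, z_2) = \mathop{\sum\sum}_{\gamma_1^2 z_2 \equiv \gamma_2^2 z_1 \,(\mathrm{mod}\, |\Delta|)} C(z_1, z_2; \gamma_1, \gamma_2),$$

where

$$C(z_1, z_2; \gamma_1, \gamma_2) = \mathop{\sum\sum}_{(c_1, c_2) \equiv (\gamma_1, \gamma_2) \,(\mathrm{mod}\, |\Delta|)} f\bigl((c_1^2 z_2 - c_2^2 z_1)/\Delta\bigr).$$

Then for each pair of classes we execute the summation by Poisson's formula obtaining for $C(z_1, z_2; \gamma_1, \gamma_2)$ the Fourier series

$$|\Delta|^{-1}|z_1 z_2|^{-1/2} \sum_{h_1} \sum_{h_2} F\bigl(h_1|\Delta z_2|^{-1/2},\, h_2|\Delta z_1|^{-1/2}\bigr)\, e\bigl((\gamma_1 h_1 + \gamma_2 h_2)|\Delta|^{-1}\bigr),$$

where

(6.8) $F(u_1, u_2) = \iint f\Bigl(\frac{z_2}{|z_2|}t_1^2 - \frac{z_1}{|z_1|}t_2^2\Bigr)\, e(u_1 t_1 + u_2 t_2)\, dt_1\, dt_2$.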

Hence we obtain

(6.9) $C(z_1, z_2) = |z_1 z_2|^{-1/2} \sum_{h_1} \sum_{h_2} F\bigl(h_1|\Delta z_2|^{-1/2},\, h_2|\Delta z_1|^{-1/2}\bigr)\, G(h_1, h_2)$

where $G(h_1, h_2)$ is the sum over the residue classes $\gamma_1, \gamma_2$ to modulus $|\Delta|$,

(6.10) $G(h_1, h_2) = |\Delta|^{-1} \mathop{\sum\sum}_{\gamma_1^2 z_2 \equiv \gamma_2^2 z_1 \,(\mathrm{mod}\, |\Delta|)} e\bigl((\gamma_1 h_1 + \gamma_2 h_2)|\Delta|^{-1}\bigr)$.

Have in mind that $F(u_1, u_2)$ and $G(h_1, h_2)$ depend also on $z_1, z_2$. Naturally the main contribution comes from $h_1 = h_2 = 0$. In this case we display the dependence on $z_1, z_2$ by writing

(6.11) $F_0(z_1, z_2) = \iint f\Bigl(\frac{z_2}{|z_2|}t_1^2 - \frac{z_1}{|z_1|}t_2^2\Bigr)\, dt_1\, dt_2$

and

(6.12) $G_0(z_1, z_2) = \frac{1}{|\Delta|}\, \#\{\gamma_1, \gamma_2 : \gamma_1^2 z_2 \equiv \gamma_2^2 z_1 \pmod{|\Delta|}\}$

which stand for $F(0, 0)$ and $G(0, 0)$ respectively. Thus the contribution of the zero frequencies to $C(z_1, z_2)$ is

(6.13) $C_0(z_1, z_2) = |z_1 z_2|^{-1/2} F_0(z_1, z_2)\, G_0(z_1, z_2)$.

In the next two sections we compute the Fourier integral $F(u_1, u_2)$ and the exponential sum $G(h_1, h_2)$.
7. The Fourier integral $F(u_1, u_2)$


Recall that $f$ is a radial function, say

$$f(w) = f(|w|) = \mathfrak{f}(|w|^2),$$

where $\mathfrak{f}$ is a smooth function supported on $[\frac{1}{4}M, 4M]$. We shall assume that $|\mathfrak{f}^{(j)}| \le M^{-j}$ for $0 \le j \le 4$. Putting

$$g(t_1, t_2) = \Bigl|\frac{z_2}{|z_2|}t_1^2 - \frac{z_1}{|z_1|}t_2^2\Bigr|^2$$

we have

(7.1) $F(u_1, u_2) = \iint \mathfrak{f}(g(t_1, t_2))\, e(u_1 t_1 + u_2 t_2)\, dt_1\, dt_2$.

Here the domain of integration is restricted by the support of $\mathfrak{f}(g)$. We have

$$g(t_1, t_2) = t_1^4 - 2\operatorname{Re}\frac{\bar{z}_1 z_2}{|z_1 z_2|}\, t_1^2 t_2^2 + t_2^4$$

and

$$\frac{\bar{z}_1 z_2}{|z_1 z_2|} = \cos(\alpha_2 - \alpha_1) + i\sin(\alpha_2 - \alpha_1) = \gamma + i\delta,$$

say, where $\alpha_j = \arg z_j$. Note that $\gamma$ is close to 1 and $|\delta|$ is small because $z_1, z_2$ are in the same polar box (5.14). Precisely

(7.2) $\delta = \Delta|z_1 z_2|^{-1} = \sin(\alpha_2 - \alpha_1)$,

so that

(7.3) $(2N)^{-1} < |\delta| < 4\pi\theta$.

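Applying the above notation we write several useful expressions for the quartic form $g(t_1, t_2)$:

$$g(t_1, t_2) = t_1^4 - 2\gamma t_1^2 t_2^2 + t_2^4 = (t_1^2 - \gamma t_2^2)^2 + \delta^2 t_2^4 = (t_2^2 - \gamma t_1^2)^2 + \delta^2 t_1^4.$$

Hence

$$4g(t_1, t_2) \ge 2\delta^2(t_1^4 + t_2^4) \ge \delta^2(t_1^2 + t_2^2)^2.$$

We have also

$$(t_1^2 - t_2^2)^2 \le g(t_1, t_2) \le (t_1^2 + t_2^2)^2.$$

Since $\frac{1}{4}M \le g \le 4M$ by the support of $\mathfrak{f}$ these inequalities imply

$$|t_1^2 - t_2^2| \le 2M^{1/2},$$

$$\tfrac{1}{2}M^{1/2} \le t_1^2 + t_2^2 \le 4M^{1/2}|\delta|^{-1}.$$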
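The three expressions for the quartic form and the two-sided inequalities above depend only on $\gamma^2 + \delta^2 = 1$ and $0 < \gamma < 1$. A small floating-point sketch, with $\gamma = \cos\alpha$, $\delta = \sin\alpha$ and hypothetical function names, confirms them at sample points.

```python
from math import cos, sin, isclose


def g_form(t1: float, t2: float, alpha: float) -> float:
    """g(t1, t2) = t1^4 - 2*gamma*t1^2*t2^2 + t2^4 with gamma = cos(alpha)."""
    return t1 ** 4 - 2 * cos(alpha) * t1 ** 2 * t2 ** 2 + t2 ** 4


def g_completed_in_t1(t1: float, t2: float, alpha: float) -> float:
    """(t1^2 - gamma*t2^2)^2 + delta^2 * t2^4 with delta = sin(alpha)."""
    return (t1 ** 2 - cos(alpha) * t2 ** 2) ** 2 + sin(alpha) ** 2 * t2 ** 4


def g_completed_in_t2(t1: float, t2: float, alpha: float) -> float:
    """(t2^2 - gamma*t1^2)^2 + delta^2 * t1^4."""
    return (t2 ** 2 - cos(alpha) * t1 ** 2) ** 2 + sin(alpha) ** 2 * t1 ** 4


def inequalities_hold(t1: float, t2: float, alpha: float, tol: float = 1e-9) -> bool:
    """Check (t1^2 - t2^2)^2 <= g <= (t1^2 + t2^2)^2 and 4g >= delta^2 (t1^2 + t2^2)^2."""
    g = g_form(t1, t2, alpha)
    lower = (t1 ** 2 - t2 ** 2) ** 2
    upper = (t1 ** 2 + t2 ** 2) ** 2
    return lower <= g + tol and g <= upper + tol and 4 * g + tol >= sin(alpha) ** 2 * upper


if __name__ == "__main__":
    print(g_form(1.5, 2.0, 0.3), g_completed_in_t1(1.5, 2.0, 0.3), g_completed_in_t2(1.5, 2.0, 0.3))
```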
Hence the area of integration (7.1) is bounded by $O(M^{1/2}\log\delta^{-2})$ so by trivial estimation

(7.4) $F(u_1, u_2) \ll M^{1/2}\log N$.

If $u_1 \ne 0$ we can integrate by parts four times with respect to $t_1$ getting

$$F(u_1, u_2) = (2\pi u_1)^{-4} \iint \frac{\partial^4}{\partial t_1^4}\mathfrak{f}(g(t_1, t_2))\, e(u_1 t_1 + u_2 t_2)\, dt_1\, dt_2.$$

We have

$$\frac{\partial^4}{\partial t_1^4}\mathfrak{f}(g) = \mathfrak{f}'g'''' + (4g'g''' + 3g''g'')\mathfrak{f}'' + 6g'g'g''\mathfrak{f}''' + g'g'g'g'\mathfrak{f}'''',$$

and by the above inequalities this is bounded by

$$M^{-1} + (t_1^2 + t_2^2)^2 M^{-2} + t_1^2(t_1^2 + t_2^2)(t_1^2 - \gamma t_2^2)^2 M^{-3} + t_1^4(t_1^2 - \gamma t_2^2)^4 M^{-4}.$$

Since

$$(t_1^2 - \gamma t_2^2)^2 + (t_2^2 - \gamma t_1^2)^2 + \delta^2(t_1^4 + t_2^4) = 2g(t_1, t_2) \le 8M$$

we deduce that

$$\frac{\partial^4}{\partial t_1^4}\mathfrak{f}(g) \ll \delta^{-2}M^{-1}.$$

This gives, by trivial estimation combined with (7.4),

$$F(u_1, u_2) \ll (1 + u_1^4\delta^2 M)^{-1} M^{1/2}\log N.$$

By symmetry this bound also holds with $u_1$ replaced by $u_2$. Taking the geometric mean of these two bounds we arrive at

$$F(u_1, u_2) \ll (1 + u_1^2|\delta|\sqrt{M})^{-1}(1 + u_2^2|\delta|\sqrt{M})^{-1} M^{1/2}\log N.$$

Hence we get by (7.2):

Lemma 7.1. For $u_1 = h_1|\Delta z_2|^{-1/2}$ and $u_2 = h_2|\Delta z_1|^{-1/2}$ the Fourier integral (6.8) satisfies

(7.5) $F(u_1, u_2) \ll (1 + h_1^2 H^{-2})^{-1}(1 + h_2^2 H^{-2})^{-1} M^{1/2}\log N$

where $H = M^{-1/4}N^{3/4}$.

In particular, for $h_1 = h_2 = 0$ the estimate (7.5) becomes

(7.6) $F_0(z_1, z_2) \ll M^{1/2}\log N$,
but in this case we need a more precise formula. We compute as follows

$$F_0(z_1, z_2) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \mathfrak{f}(g(t_1, t_2))\, dt_1\, dt_2 = 4\int_0^{\infty}\int_0^{\infty} f\bigl((t_1^4 - 2\gamma t_1^2 t_2^2 + t_2^4)^{1/2}\bigr)\, dt_1\, dt_2$$

$$= 4\int_0^{\infty}\int_0^{\infty} f\bigl(u^2(t^4 - 2\gamma t^2 + 1)^{1/2}\bigr)\, u\, du\, dt = \hat{f}(0)E(\gamma),$$

where $\hat{f}(0)$ is the integral mean-value of $f$,

(7.7) $\hat{f}(0) = \int_0^{\infty} f(u)\, du \ll M^{1/2}$

and $E(\gamma)$ denotes the elliptic integral

(7.8) $E(\gamma) = \int_0^{\infty} (t^2 - 2\gamma t + 1)^{-1/2}\, t^{-1/2}\, dt$.

Since $\gamma$ is close to 1 we have a satisfactory asymptotic expansion (cf. (3.138.7) and (8.113.3) of [GR])

(7.9) $E(\gamma) = \log 4\delta^{-2} + O(\delta^2\log\delta^{-2})$.

Insertion of (7.2) in this gives

Lemma 7.2. For $z_1, z_2$ in the box (5.14) the integral (6.11) satisfies

(7.10) $F_0(z_1, z_2) = \hat{f}(0)\, 2\log 2|z_1 z_2/\Delta| + O\bigl(M^{1/2}\Delta^2 N^{-2}\log N\bigr)$.


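8. The arithmetic sum $G(h_1, h_2)$

Recall that $G(h_1, h_2)$ is given by (6.10). This is a kind of Weyl sum for the equidistribution of roots $\gamma_1, \gamma_2$ of the quadratic form $\gamma_1^2 z_2 - \gamma_2^2 z_1$ modulo $|\Delta|$. We write (uniquely)

(8.1) $\Delta = \Delta_1\Delta_2^2$

where $\Delta_1$ is squarefree and $\Delta_2 \ge 1$. The solutions to

(8.2) $\gamma_1^2 z_2 \equiv \gamma_2^2 z_1 \pmod{|\Delta|}$

satisfy $(\gamma_1^2, \Delta) = (\gamma_2^2, \Delta) = d_1 d_2^2$, say, where $d_1, d_2 \ge 1$ and $d_1$ is squarefree. This implies $d_1 \mid \Delta_1$, $d_2 \mid \Delta_2$ and $(d_1, \Delta_2/d_2) = 1$. Moreover $\gamma_1 = d_1 d_2\eta_1$, $\gamma_2 = d_1 d_2\eta_2$ where $\eta_1, \eta_2$ run over the residue classes to modulus $|\Delta|/d_1 d_2$ and coprime with $|\Delta|/d_1 d_2^2$. Accordingly $G(h_1, h_2)$ splits into

$$G(h_1, h_2) = |\Delta|^{-1} \mathop{\sum\sum}_{\substack{b_1 d_1 = |\Delta_1| \\ b_2 d_2 = \Delta_2 \\ (d_1, b_2) = 1}}\ \mathop{\sum\sum}_{\substack{\eta_1, \eta_2 \,(\mathrm{mod}\, b_1 b_2^2 d_2) \\ (\eta_1\eta_2,\, b_1 b_2) = 1 \\ \eta_1^2 z_2 \equiv \eta_2^2 z_1 \,(\mathrm{mod}\, b_1 b_2^2)}} e\bigl((\eta_1 h_1 + \eta_2 h_2)/b_1 b_2^2 d_2\bigr).$$

The innermost sum vanishes unless $h_1 \equiv h_2 \equiv 0 \pmod{d_2}$ so the full sum is equal to

$$|\Delta|^{-1} \mathop{\sum\sum}_{\substack{b_1 d_1 = |\Delta_1|,\ b_2 d_2 = \Delta_2 \\ (d_1, b_2) = 1,\ d_2 \mid (h_1, h_2)}} d_2^2 \mathop{\sum\sum}_{\substack{\eta_1, \eta_2 \,(\mathrm{mod}\, b_1 b_2^2) \\ (\eta_1\eta_2,\, b_1 b_2) = 1 \\ \eta_1^2 z_2 \equiv \eta_2^2 z_1 \,(\mathrm{mod}\, b_1 b_2^2)}} e\bigl((\eta_1 h_1 + \eta_2 h_2)/b_1 b_2^2 d_2\bigr).$$

Changing $\eta_2$ into $\omega\eta_1$ we conclude that $G(h_1, h_2)$ is given by

(8.3) $|\Delta|^{-1} \mathop{\sum\sum}_{\substack{b_1 d_1 = |\Delta_1|,\ b_2 d_2 = \Delta_2 \\ (d_1, b_2) = 1,\ d_2 \mid (h_1, h_2)}} d_2^2 \sum_{\omega^2 \equiv z_2/z_1 \,(\mathrm{mod}\, b_1 b_2^2)} R\bigl((h_1 + \omega h_2)d_2^{-1};\, b_1 b_2^2\bigr)$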

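where $R(h; b)$ denotes the Ramanujan sum

$$R(h; b) = {\sum_{\eta \,(\mathrm{mod}\, b)}}^{\!\prime}\, e\Bigl(\frac{\eta h}{b}\Bigr).$$

Using the well-known bound $|R(h; b)| \le (h, b)$, we obtain

$$|R((h_1 + \omega h_2)d_2^{-1};\, b_1 b_2^2)| \le ((h_1 + \omega h_2)d_2^{-1},\, b_1 b_2^2) \le ((h_1^2 - \omega^2 h_2^2)d_2^{-2},\, b_1 b_2^2) \le (z_1 h_1^2 - z_2 h_2^2,\, \Delta)\, d_2^{-2}.$$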
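The bound $|R(h; b)| \le (h, b)$ used above follows from the classical evaluation $R(h; b) = \sum_{d \mid (h, b)} d\,\mu(b/d)$. The sketch below compares a direct summation over reduced residues with this divisor formula and checks the bound for small moduli; the function names are hypothetical.

```python
from math import cos, gcd, pi


def mobius(n: int) -> int:
    """Moebius function by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result


def ramanujan_sum(h: int, b: int) -> int:
    """R(h; b): sum of e(eta*h/b) over eta mod b coprime to b (a real integer)."""
    total = sum(
        cos(2 * pi * ((h * eta) % b) / b)   # imaginary parts cancel in pairs
        for eta in range(1, b + 1)
        if gcd(eta, b) == 1
    )
    return round(total)


def ramanujan_sum_by_divisors(h: int, b: int) -> int:
    """Classical formula: sum over d | gcd(h, b) of d * mu(b/d)."""
    g = gcd(h, b)
    return sum(d * mobius(b // d) for d in range(1, g + 1) if g % d == 0)


if __name__ == "__main__":
    print(ramanujan_sum(6, 12), ramanujan_sum_by_divisors(6, 12), gcd(6, 12))
```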

Denote by $n(z; b)$ the number of solutions in rational integer classes $\omega$ modulo $b$ of the quadratic congruence

(8.4) $\omega^2 \equiv z \pmod b$.

Incidentally notice that for $z \equiv -a^2 \pmod b$ we get the arithmetic function $n(-a^2; b) = \rho(a; b)$ which was considered in Section 3. Of course, $n(z; b)$ vanishes if $z$ is not congruent to a rational integer, however in our case $z = z_2/z_1$ is rational modulo $|\Delta|$ and prime to $\Delta$; see (6.6). Applying the trivial bound $n(z_2/z_1; b_1 b_2^2) \le 4\tau(b_1 b_2)$ together with the above bound for Ramanujan's sum we deduce by (8.3) that

Lemma 8.1. For any $h_1, h_2$ the exponential sum (6.10) satisfies

(8.5) $|G(h_1, h_2)| \le 4\tau_3(\Delta)|\Delta|^{-1}(z_1 h_1^2 - z_2 h_2^2, \Delta)$.

In particular for $h_1 = h_2 = 0$ the estimate (8.5) becomes

(8.6) $G_0(z_1, z_2) \le 4\tau_3(\Delta)$

which is almost best possible but in this case we need an exact formula. Since $R(0; b) = \varphi(b)$ we obtain by (8.3)

(8.7) $G_0(z_1, z_2) = \mathop{\sum\sum}_{\substack{b_1 d_1 = |\Delta_1|,\ b_2 d_2 = \Delta_2 \\ (b_2, d_1) = 1}} d_1^{-1}(b_1 b_2^2)^{-1}\varphi(b_1 b_2^2)\, n(z_2/z_1;\, b_1 b_2^2)$.

On the other hand by (6.12) we have $|\Delta| G_0(z_1, z_2) = N(z_2/z_1; |\Delta|)$ where $N(a; q)$ denotes the number of solutions to

(8.8) $a\gamma_1^2 \equiv \gamma_2^2 \pmod q$.

Since $N(a; q)$ is multiplicative we have

(8.9) $G_0(z_1, z_2) = \prod_{p^{\nu} \| \Delta} p^{-\nu} N(z_2/z_1;\, p^{\nu})$.

This expression reduces the problem to local computations.

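Lemma 8.2. For $p \ne 2$ and $(a, p) = 1$ we have

(8.10) $p^{-\nu} N(a; p^{\nu}) = 1 + \Bigl(1 - \frac{1}{p}\Bigr)\Bigl(\Bigl[\frac{\nu}{2}\Bigr] + \Bigl[\frac{\nu + 1}{2}\Bigr]\Bigl(\frac{a}{p}\Bigr)\Bigr)$.

For $p = 2$, $\nu \ge 1$, $a \equiv 1 \pmod 8$ we have

(8.11) $2^{-\nu} N(a; 2^{\nu}) = \nu$.

Proof. One could proceed by counting the solutions to (8.8) directly, however we use the formula (8.7). This gives

$$p^{-\nu} N(a; p^{\nu}) = \frac{1}{2p}\bigl(1 - (-1)^{\nu}\bigr) + \sum_{\substack{0 \le \alpha \le \nu \\ \alpha \equiv \nu \,(\mathrm{mod}\, 2)}} p^{-\alpha}\varphi(p^{\alpha})\, n(a; p^{\alpha})$$

where the first term is present only if $2 \nmid \nu$ and it comes from $d_1 = p$ in (8.7). For $p \ne 2$ and $\alpha \ge 1$ we have

(8.12) $n(a; p^{\alpha}) = 1 + \Bigl(\frac{a}{p}\Bigr)$

which leads to (8.10). For $p = 2$ and $a \equiv 1 \pmod 8$ we have $n(a; 2^{\alpha}) = (4, 2^{\alpha - 1})$ yielding (8.11). □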
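Although the proof goes through (8.7), the local counts in Lemma 8.2 can also be confirmed by direct enumeration of (8.8) for small prime powers. The following sketch, with hypothetical function names, compares the brute-force count with the closed forms (8.10) and (8.11) written in integer arithmetic.

```python
def count_solutions(a: int, q: int) -> int:
    """N(a; q): number of pairs (g1, g2) mod q with a*g1^2 = g2^2 (mod q)."""
    return sum(
        1
        for g1 in range(q)
        for g2 in range(q)
        if (a * g1 * g1 - g2 * g2) % q == 0
    )


def legendre(a: int, p: int) -> int:
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    r = pow(a, (p - 1) // 2, p)
    return -1 if r == p - 1 else r


def predicted_odd(a: int, p: int, nu: int) -> int:
    """p^nu times the right-hand side of (8.10), for odd p and (a, p) = 1."""
    bracket = nu // 2 + ((nu + 1) // 2) * legendre(a, p)
    return p ** nu + p ** (nu - 1) * (p - 1) * bracket


def predicted_two(nu: int) -> int:
    """2^nu times the right-hand side of (8.11), for a = 1 (mod 8)."""
    return nu * 2 ** nu


if __name__ == "__main__":
    print(count_solutions(2, 27), predicted_odd(2, 3, 3))
    print(count_solutions(9, 16), predicted_two(4))
```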

In the formula (8.10) we write 



which leads to the following global expression: 
Corollary 8.3. For $q$ odd and $(a, q) = 1$ we have

(8.13) $N(a; q) = q \sum_{d \mid q} \frac{\varphi(d)}{d}\Bigl(\frac{a}{d}\Bigr)$.

By (8.9), (8.11), and (8.13) we infer that 


(8.14) 



dodd 


where $\nu$ is the order of 2 in $\Delta$, that is $\Delta = 2^{\nu}\Delta'$ with $\Delta'$ odd. Note that $z_2/z_1 \equiv 1 \pmod 8$ due to (5.7), so $\nu \ge 3$.

To accommodate the factor v we extend the Jacobi symbol to even moduli 
by setting 


(8.15) 




if 2 fa, 


where d! denotes the odd part of d. Now we conclude by (8.14) 


Lemma 8.4. Suppose that $(z_1, z_2) = (z_1, \bar{z}_1) = (z_2, \bar{z}_2) = 1$ and also $z_1 \equiv z_2 \pmod 8$. Then the number of congruence pairs of solutions in (6.12) with given $\Delta = \Delta(z_1, z_2) = \operatorname{Im} \bar{z}_1 z_2$ is expressed by


(8.16) 



9. Bounding the error term in the lattice point problem 

In this section we combine the results of the previous two sections completing the estimation of the error term in the lattice point problem of Section 6. The estimation of the main term will take the rest of the paper.

By (6.9), (7.5) and (8.5) we obtain

(9.1) $C(z_1, z_2) = C_0(z_1, z_2) + O\bigl(\tau_3(\Delta)|\Delta|^{-1}\mathcal{H}(z_1, z_2) M^{1/2}N^{-1/2}\log N\bigr)$

where $C_0(z_1, z_2)$ is given by (6.13) and

(9.2) $\mathcal{H}(z_1, z_2) = \mathop{\sum\sum}_{(h_1, h_2) \ne (0,0)} (z_1 h_1^2 - z_2 h_2^2, \Delta)(1 + h_1^2 H^{-2})^{-1}(1 + h_2^2 H^{-2})^{-1}$.

Recall that $H = M^{-1/4}N^{3/4}$. For aesthetic reasons only we would like to estimate $\mathcal{H}(z_1, z_2)$ for individual $z_1, z_2$, however the effective range of summation is too short to do so. Note that for $N$ in (4.6) and $MN$ satisfying (4.19) we have

(9.3) x 71 <H <m . 
Thus $h_1, h_2$ are small indeed, nevertheless $(z_1 h_1^2 - z_2 h_2^2, \Delta)$ can be quite large for some points $z_1, z_2$. For this reason we take advantage of the additional summation over $z_1, z_2$ which is present in our main problem.

Given $h_1, h_2$ not both zero we put

(9.4) $\Lambda = \Lambda(z_1, z_2) = z_1 h_1^2 - z_2 h_2^2$.

We begin by estimating the sum

(9.5) $Z(h_1, h_2) = \mathop{\sum\sum}_{(z_1, z_2) = 1} |\beta_{z_1}\beta_{z_2}|\, \tau_3(\Delta)(\Delta, \Lambda)|\Delta|^{-1}$

where $\Delta = \Delta(z_1, z_2)$ is the determinant defined in (6.1). We have $1 \le |\Delta| \le N$ and, because of (5.9),

(9.6) $|\beta_z| \le \tau(|z|^2) \le \tau$.

Hence

$$Z(h_1, h_2) \le 2\tau^2(\log N)D^{-1} \mathop{\sum\sum}_{\substack{z_1, z_2 \in \mathfrak{B} \\ D < |\Delta| < 2D}} \tau_3(\Delta)(\Delta, \Lambda)$$

for some $1 \le D \le N$ where $\mathfrak{B}$ is the polar box (5.14). Next we group terms according to the value of $(\Delta, \Lambda) = d$ say, getting

$$Z(h_1, h_2) \le 2\tau^2(\log N)D^{-1} \sum_{d \le 2D} d \mathop{\sum\sum}_{\substack{z_1, z_2 \in \mathfrak{B} \\ D < |\Delta| < 2D \\ \Delta \equiv \Lambda \equiv 0 \,(\mathrm{mod}\, d)}} \tau_3(\Delta).$$

For further computations we use the rectangular coordinates $z_1 = r_1 + is_1$ and $z_2 = r_2 + is_2$ with $r_1, r_2, s_1, s_2$ satisfying (5.4), (5.5). Observe the relations

$$\Lambda(r_1, r_2) \equiv \Lambda(s_1, s_2) \equiv 0 \pmod d,$$

$$\Lambda(r_1, r_2)s_2 - \Lambda(s_1, s_2)r_2 = \Delta(z_1, z_2)h_1^2,$$

$$\Lambda(r_1, r_2)s_1 - \Lambda(s_1, s_2)r_1 = \Delta(z_1, z_2)h_2^2.$$

Since $\Delta(z_1, z_2) \ne 0$ and $h_1^2 + h_2^2 \ne 0$ we have either $\Lambda(r_1, r_2)$ or $\Lambda(s_1, s_2)$ different from zero, thus we may assume that

$$0 \ne \Lambda(r_1, r_2) \equiv 0 \pmod d.$$
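The two determinant relations above are polynomial identities in $r_1, s_1, r_2, s_2, h_1, h_2$, so they can be confirmed by exhaustive evaluation on a small grid. In the sketch below $\Lambda(x_1, x_2)$ stands for $x_1 h_1^2 - x_2 h_2^2$ applied to rational integers; the function names are hypothetical.

```python
def lam(x1: int, x2: int, h1: int, h2: int) -> int:
    """Lambda(x1, x2) = x1*h1^2 - x2*h2^2 for rational integers x1, x2."""
    return x1 * h1 * h1 - x2 * h2 * h2


def determinant(r1: int, s1: int, r2: int, s2: int) -> int:
    """Delta(z1, z2) = r1*s2 - r2*s1 for z1 = r1 + i*s1, z2 = r2 + i*s2."""
    return r1 * s2 - r2 * s1


def relations_hold(r1: int, s1: int, r2: int, s2: int, h1: int, h2: int) -> bool:
    """Check both displayed relations between Lambda(r), Lambda(s) and Delta."""
    lam_r = lam(r1, r2, h1, h2)
    lam_s = lam(s1, s2, h1, h2)
    d = determinant(r1, s1, r2, s2)
    first = lam_r * s2 - lam_s * r2 == d * h1 * h1
    second = lam_r * s1 - lam_s * r1 == d * h2 * h2
    return first and second


if __name__ == "__main__":
    print(relations_hold(3, 2, 1, -4, 2, 5))
```

In particular, if both $\Lambda(r_1, r_2)$ and $\Lambda(s_1, s_2)$ vanished, the identities would force $\Delta h_1^2 = \Delta h_2^2 = 0$, which is the step used in the text.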

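Given $r_1, r_2$ and any value of $\Delta$ the number of ordered pairs $s_1, s_2$ which give $z_1, z_2$ in $\mathfrak{B}$ satisfying $r_1 s_2 - r_2 s_1 = \Delta$ is $O(\vartheta^{-1}(r_1, r_2))$. Moreover we have

$$\sum_{\substack{D \le |\Delta| < 2D \\ \Delta \equiv 0 \,(\mathrm{mod}\, d)}} \tau_3(\Delta) \ll \tau_3(d)d^{-1}D(\log 2D)^2.$$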
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Therefore 

Z[hi,h 2 ) <C $ -1 T 2 (log N) 3 EEE r 3 (c?)(ri,r 2 ) 

d r\ 7*2 
0^A(ri,r2)=0(d) 

= '!? _1 r 2 (log N) 3 EE (fi, r 2 )r4(A(ri, r 2 )). 

A(ri,r 2)^0 

Once again we have a difficulty to complete the summation, but this time 
in the variables ri,r 2 because of the divisor function T 4 (A(ri,r 2 )), especially 
when p = (ri,r 2 ) is large. For this reason we return to the summation in 
h\, h 2 . We need to estimate 

L(ri,r 2 )= T ^{nh\~r 2 hl){l + hlH~ 2 )~ l (l + hlH~ 2 )~ 1 

r\h\^r 2 h^ 

where $H$ satisfies (9.3). Note that we can restrict this series to $h_1, h_2 \ll H^3$
because the contribution of the other terms is absorbed by the term with
$h_1 = 1$, $h_2 = 0$. In this truncated series we estimate $\tau_4(r_1 h_1^2 - r_2 h_2^2)$ by $\tau(q)^c$
for some $q \le H$ with $r_1 h_1^2 \equiv r_2 h_2^2 \pmod{q}$, where $c$ is a constant depending only
on $\eta$. Specifically we may use Lemma 2.2 for $n = |r_1 h_1^2 - r_2 h_2^2|$ and note that
$n \le 4\sqrt{N} H^6 \le H^{1/2\eta}$ by (9.3) so it suffices to take $c = \eta^{-1} \log \eta^{-1}$. Therefore

$L(r_1, r_2) \ll \sum_{q \le H} \tau(q)^c \mathop{\sum\sum}_{r_1 h_1^2 \equiv r_2 h_2^2 \,(\mathrm{mod}\, q)} (1 + h_1^2 H^{-2})^{-1} (1 + h_2^2 H^{-2})^{-1}$

where the restriction to $h_1, h_2 \ll H^3$ is no longer required. Splitting into
residue classes to modulus $q$ we get

(9.7) $L(r_1, r_2) \ll H^2 \sum_{q \le H} \tau(q)^c q^{-2} N(r_1, r_2; q)$

where $N(a, b; q)$ denotes the number of solutions $\gamma_1, \gamma_2$ to

(9.8) $a\gamma_1^2 \equiv b\gamma_2^2 \pmod{q}$.

If $(b, q) = 1$ then $N(a, b; q)$ is equal to $N(ab; q)$ and the latter was evaluated
in Corollary 8.3 in the case $(2ab, q) = 1$. Now we give a general estimate.

Lemma 9.1. For any integers $a, b, q$ with $q \ge 1$ we have

(9.9) $N(a, b; q) \le ([a, b], q)\, q\, \tau(q)$.

Proof. By multiplicativity we can assume that q is a prime power, say 
q = p v . For q prime the bound is trivial if p\ab, while if not it reduces for each 
y 2 to the congruence y 2 = r(modg) which has at most two solutions. From 
now on suppose u > 2. If p \ a and p \ b then we reduce to the case q = p u ~ l by 
dividing through by p. If p \ a and p \ b then p 2 \ a so we can divide by p 2 and 
reduce to the case q = p u ~ 2 getting (by induction) a bound p~ 4 (ab, q)qr(qp ~ 2 )
for the solutions modulo qp~ 2 ■ Multiplying this bound by p 4 we get the result. 

There remains the case p \ ab. If p | 72 then also p | 71 and these give by 
induction a contribution no more than 

qp~ 2 r(qp~ 2 )p 2 = qr(q) — 2 q . 

The other solutions satisfy p \ 7172 and for each of the ip(q) values of 72 there 
are at most two values of 71 except for q even in which case there are at most 
four. Therefore the primitive solutions contribute at most 2 <p(q) < 2 q for q 
odd and 4 ip{q) = 2 q for q even. Adding these contributions we obtain (9.9). □ 
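
As a sanity check on Lemma 9.1, one can count the solutions of (9.8) by brute force for small moduli and compare with the right side of (9.9). The following Python sketch is an illustration only and not part of the original argument. The function names are hypothetical, and it assumes that (9.9) is read with $[a, b]$ the least common multiple, with a plain inequality, and with $a, b \ge 1$.

```python
from math import gcd


def count_solutions(a: int, b: int, q: int) -> int:
    """Number of pairs (g1, g2) mod q with a*g1^2 = b*g2^2 (mod q), as in (9.8)."""
    return sum(
        1
        for g1 in range(q)
        for g2 in range(q)
        if (a * g1 * g1 - b * g2 * g2) % q == 0
    )


def num_divisors(n: int) -> int:
    """The divisor function tau(n), computed naively."""
    return sum(1 for d in range(1, n + 1) if n % d == 0)


def bound_9_9(a: int, b: int, q: int) -> int:
    """Right side of (9.9), ([a, b], q) * q * tau(q), assuming a, b >= 1."""
    lcm_ab = a * b // gcd(a, b)
    return gcd(lcm_ab, q) * q * num_divisors(q)


if __name__ == "__main__":
    # Exhaustive comparison over a small (arbitrarily chosen) range.
    for q in range(1, 17):
        for a in range(1, 9):
            for b in range(1, 9):
                assert count_solutions(a, b, q) <= bound_9_9(a, b, q)
    print(count_solutions(1, 1, 8), bound_9_9(1, 1, 8))
```

The search is quadratic in $q$, so it is only meant for very small moduli; it does not replace the inductive proof above.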

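Inserting (9.9) into (9.7) we get

$L(r_1, r_2) \ll H^2 \sum_{q \le H} q^{-1} \tau(q)^{c+1} ([r_1, r_2], q) \ll \tau([r_1, r_2])^{c+2} H^2 (\log H)^{2^{c+1}}$.

Moreover we have

$\sum_{r_1} \sum_{r_2} (r_1, r_2)\, \tau([r_1, r_2])^{c+2} \le \sum_{\rho} \rho\, \tau(\rho)^{c+2} \Big( \sum_{\rho r \le 2\sqrt{N}} \tau(r)^{c+2} \Big)^2 \ll N (\log N)^{2^{c+4}}$.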

From these estimates we conclude by (5.26) and (9.1) that 

(9.10) V* (M, N) = V 0 (M,N) + O (VVA^loglV)^ 

for some b depending on rj; precisely b = 2 C+4 + 2 C+1 + 4 <C where r/ is 

fixed in Proposition 4.1. Here the main term is 

(9.11) D 0 (M,N) = EE Pz 1 P Z 2 C 0 (z 1 ,Z 2 ), 

(zi,22)=l 

and the error term is, by (4.6), (4.15), (4.16), and (4.19), admissible for (5.25) 
provided that 

(9.12) r<( logx)^ B ~^ A ~ 2A '-^ b . 

10. Breaking up the main term 

It remains to estimate the main term (9.11). Recall that $C_0(z_1, z_2)$ is given
by (6.13) in terms of the Fourier integral (6.11) and the arithmetic sum (6.12).
Although we refer to the sum $\mathcal{D}_0(M, N)$ as the “main term” since it originated
from the leading term in the lattice point problem, it is in fact, in contrast
to what is usually called by that name, smaller than it would appear at first
glance. This, however, is for reasons more subtle than those responsible for the
size of what we have called the error term.

The lacunarity of the original sequence $\mathcal{A} = (a_n)$ has featured in the
estimation for the error term in (9.10), and it is no longer an issue in the main
term $\mathcal{D}_0(M, N)$. Thus it is quite easy to derive by trivial estimation the bound

(10.1) V 0 (M,N) < 0 4 M^Tvi (log TV) 8 

which of course barely misses what one wants, namely (5.20). The required
improvement by a factor $\vartheta^2$ (actually a saving of an arbitrary power of $\log N$)
will result from the cancellation of terms due to the sign change of $\beta_{z_1} \beta_{z_2}$
which involve the Möbius function. These are twisted by the arithmetic kernel
$G_0(z_1, z_2)$ which is rather sophisticated; it involves the Jacobi symbol

(10.2) wteAl) = 

which in turn originated (in Section 8) from the roots to the congruence 

(10.3) u 2 = z^lz\ (mod |A|) . 

Had the twist of $\beta_{z_1} \beta_{z_2}$ been the smooth function $F_0(z_1, z_2)$ alone, or for
that matter a separable arithmetic function of suitable character, we would
be able to receive the cancellation quickly from an excursion into the zero-free
region for Hecke L-functions with Grossencharacters. However the presence
of the symbol $\chi_d(z_2/z_1)$ to very large moduli $d$ obscures the situation and we
need modern technology to resolve the problems. Among our arguments one
can find some traces of automorphic theory but we do not dwell on these here.
Retrospectively, the treatment of the main term $\mathcal{D}_0(M, N)$ should be regarded
as the core of our proof of the main Theorem 1, though it did not seem a
central matter when we got the very first ideas.

In this section we relax some factors in the main term by using familiar 
approximations and we break up what is left into three parts according to the 
size of the moduli (small, medium, large) to be treated separately by three 
distinct methods in the forthcoming sections. 

Inserting (6.13) into (9.11) we get by the approximations (7.10) and (8.6) 

V 0 (M,N) =2/(0) EE (3 zl f3 Z2 G Q (zi, z 2 )\ziz 2 \ 2 log 2\zi z 2 /A| 

(Z1,Z2)=1 

+ O (log TV) I^J a2 N(A)A 2 

(21,22)=! 

Recall that $\beta_z$ are supported on primary numbers in a polar box (5.14) and
are restricted by (5.8), (5.9). We no longer need, nor wish to have, this last
condition $\tau(|z|^2) \le \tau$. We remove this in the same way as we have installed it
between (4.9) - (4.11). We also simplify slightly by inserting

\ziZ2\~* = (l + O(0))N~s 

(see (5.14)), and we use (6.2), (7.7), (8.6) to arrive at 

(10.4) V 0 (M,N) = 2/(0)fV-3T(/3) + 0 (( t " 1 +0)Y(P)M7N~* log N) . 
Here 

(10.5) T{(5) = EE Pzi P Z2 Go (Zi , ^ 2 ) log 21 Zi z 2 / AI 

(Z1,Z2)=1 

and 

(10.6) Y(P) = l&iflJ^MVlM 2 ) 7 ^) . 

(zi,Z2)=l 

From now on the condition $\tau(|z|^2) \le \tau$ no longer exists, however we are
left with the parameter $\tau$ in (9.10) and (10.4) which may be chosen at will.

Lemma 10.1. For any $\beta_z$ supported in the polar box (5.14) and satisfying
$|\beta_z| \le \tau(|z|^2)$ we have

(10.7) $Y(\beta) \ll \theta^4 N^2 (\log N)^{2^{19}}$.

Proof. By Lemma 2.2 there exist d,d\, d /2 < N* such that d.d\, d 2 are 
mutually co-prime, 

(10.8) d\A(zi,z 2 ), d\ | |zi| 2 , d 2 \\z 2 \ 2 

(10.9) t 3 (A) < r(d) 16 , r(|zi| 2 ) < r(di) 8 , t(|z 2 | 2 ) < T~(d 2 f . 

Since the number of points $z_1, z_2 \in \mathfrak{B}$ satisfying (10.8) and (10.9) is bounded
by $O\big(\theta^4 (N \log N)^2 \tau(d d_1 d_2)(d d_1 d_2)^{-1}\big)$ we obtain

$Y(\beta) \ll \theta^4 (N \log N)^2 \sum_d \sum_{d_1} \sum_{d_2} \tau(d d_1 d_2)^{17} (d d_1 d_2)^{-1}$,

which yields (10.7). □ 

Inserting (10.7) into (10.4) we get 

(10.10) V 0 (M,N) = 2f {O)N-^T(fd) + O + 9)e 4 M^Nl{log N) 220 ^j . 
Here the error term satisfies the bound (5.25) provided

(10.11) $\tau \ge \vartheta^{-2} (\log x)^{2^{20}} = (\log x)^{2A + 2^{20}}$

and $\theta \le \vartheta^2 (\log N)^{-2^{20}}$. The latter condition is assured since $A'$ can be chosen
much larger than $A$, for example

(10.12) $A' \ge 2A + 2^{20}$

would suffice. The lower bound (10.11) is not in contradiction with the upper
bound (9.12) provided that $B = B(\eta, A)$ in the statement of Proposition 4.1
is sufficiently large.

Now the last step is to estimate $T(\beta)$. Our target is

Proposition 10.2. For the $\beta_z$ given by (5.13) and restricted by (5.7),
(5.8) we have

$T(\beta) \ll N^2 (\log N)^{-\sigma} + P^{-1} N^2 \log N$

for any $\sigma > 0$, the implied constant depending on $\sigma$.

Remarks. For the proof of Proposition 10.2 the lower bound for $P$ given in
(4.4) is never utilized. Of course, since this assumption is now implicit in (5.8),
the second term on the right side in this proposition is actually superfluous.

Inserting (8.16) in (10.5) and changing the order of summation we get

(10.13) T W = 2^'if- ZE AA,(2£)tog2|f^l . 

d (zi,Z2)=l 

A(zi,Z2)=0(4d) 

Note that 1 < d ^ N because 1 < |A| < N. We split this sum into 

(10.14) $T(\beta) = U(\beta) + V(\beta) + W(\beta)$
where 



where $X$ will be chosen as a sufficiently large power of $\log N$ and $f$, $f^*$ are
smooth functions whose graphs are

[Figure: graphs of $f$ and $f^*$]

Note that $V(\beta)$ ranges over $d$ with $X < d < |\Delta| X^{-1}$ (this can be void)
while $W(\beta)$ ranges over $d$ with $d^* = |\Delta|/4d < X$. Separating these cases with
the smooth partition $f + f^*$ of unity will later help simplify some technical
details. Each of these three parts will be estimated separately with considerable
effort by very different methods.

First we shall deal with $V(\beta)$ because for the medium size moduli our arguments
are quite general. No special properties of $\beta_z$ are needed. The source
of cancellation is the sign change of the symbol $\chi_d(z_2/z_1)$, not the sign change
of the Möbius function which is a component of $\beta_z$. Therefore we can afford to
contaminate $\beta_z$ considerably when processing technical matters such as separation
of variables. Thus, here we cut the range at $|\Delta|/4X$ which depends on the
points $z_1, z_2$ to save technical work later in the range of large moduli. It will
take us the next four sections to establish a general result (Proposition 14.1)
which is adequate for application to $V(\beta)$ (see Proposition 15.1).


11. Jacobi-twisted sums over arithmetic progressions 


Given a sequence $\mathcal{A} = (a_n)$ of complex numbers, it is natural to study its
distribution in various residue classes $a \bmod d$. The goal is to establish a good
approximation to

(11.1) $A(x; d, a) = \sum_{n \le x,\ n \equiv a \,(\mathrm{mod}\, d)} a_n$

by a simple function of $d$ and with a relatively small error term uniformly
for $d$ in a large range. For example, in sieve theory one considers the zero
residue class in which case $A(x; d, 0)$ is well approximated by $g(d) A(x; 1, 0)$
with $g$ a nice multiplicative function. When $a$ is not the zero class mod $d$ then
the expected main term for $A(x; d, a)$ is slightly different. Let us focus on the
primitive classes $a$, that is with $(a, d) = 1$. Among these classes a reasonable
sequence $\mathcal{A} = (a_n)$ is expected to be uniformly distributed, which means that


(11.2) $r(x; d, a) = \sum_{n \le x,\ n \equiv a \,(\mathrm{mod}\, d)} a_n - \frac{1}{\varphi(d)} \sum_{n \le x,\ (n, d) = 1} a_n$

is quite small. Indeed, the theory of L-functions, if applicable, usually leads
to the bound

(11.3) $r(x; d, a) \ll \|\mathcal{A}\|\, x^{1/2} (\log x)^{-A}$

for any $A \ge 1$ with the implied constant depending only on $A$. In the case
$a_n = \Lambda(n)$ this result is known as the Siegel-Walfisz theorem.

Although (11.3) holds uniformly for all primitive residue classes it is useful
only for relatively small moduli; the bound (11.3) is trivial for $d > (\log x)^{2A}$.
However, by the large sieve inequality one can extend (11.3) on average over
such classes to moduli as large as $D = x (\log x)^{-A}$. Specifically, one shows that

(11.4) $\sum_{d \le D} \sum_{a \,(\mathrm{mod}\, d),\ (a, d) = 1} |r(x; d, a)|^2 \ll \|\mathcal{A}\|^2 x (\log x)^{-A}$

with the same $A$ as in (11.3). In the case $a_n = \Lambda(n)$ this result, in a slightly less
explicit form, is known as the Barban-Davenport-Halberstam theorem. The
above generalization is, apart from the explicit power of $\log x$, given in [BFI].
Since we do not use (11.4) in this paper there is no need to provide a proof.

In this section we begin to investigate twisted sums in arithmetic progressions
of type

(11.5) $\mathop{\sum\sum}_{\bar{r}s \equiv a \,(\mathrm{mod}\, d)} \alpha_{rs} \left(\frac{r}{d'}\right)$

and we shall continue the study of these sums in the following three sections.
Here $\bar{r}$ denotes the multiplicative inverse modulo $d$. Note this implies $(r, d) = 1$.
Also $d'$ denotes the odd part of $d$ and $\left(\frac{r}{d'}\right)$ is the Jacobi symbol. We pulled
out $d'$ because it would be confusing to use here the convention (8.15). Keep
in mind that we consider (11.5) with any complex numbers $\alpha_{rs}$ supported in
the dyadic box

(11.6) $R < r \le 2R$ and $S < s \le 2S$.
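
For readers who wish to experiment with the twist in (11.5), the Jacobi symbol to the odd part $d'$ of the modulus can be computed by the standard reciprocity algorithm. The Python sketch below is an illustration only; the names `jacobi`, `odd_part` and `twist` are hypothetical and do not come from the paper.

```python
def jacobi(a: int, n: int) -> int:
    """Jacobi symbol (a/n) for a positive odd integer n."""
    if n <= 0 or n % 2 == 0:
        raise ValueError("n must be a positive odd integer")
    a %= n
    result = 1
    while a != 0:
        # Pull out factors of 2 using (2/n) = -1 exactly when n = 3, 5 (mod 8).
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        # Quadratic reciprocity: swap, with a sign flip when both are 3 (mod 4).
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0


def odd_part(d: int) -> int:
    """The odd part d' of a positive integer d."""
    if d <= 0:
        raise ValueError("d must be positive")
    while d % 2 == 0:
        d //= 2
    return d


def twist(r: int, d: int) -> int:
    """The factor (r/d') appearing in (11.5), with d' the odd part of d."""
    return jacobi(r, odd_part(d))
```

For example, `twist(3, 14)` evaluates the symbol of 3 to the modulus 7.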

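We define the local variance

(11.7) $E(d) = \sum_{a \,(\mathrm{mod}\, d)} \Big| \mathop{\sum\sum}_{\bar{r}s \equiv a \,(\mathrm{mod}\, d)} \alpha_{rs} \left(\frac{r}{d'}\right) \Big|^2$.

Note that $E(d)$ is not restricted to primitive classes. We have

(11.8) $E(d) = \mathop{\sum\sum\sum\sum}_{\bar{r}_1 s_1 \equiv \bar{r}_2 s_2 \,(\mathrm{mod}\, d)} \alpha_{r_1 s_1} \bar{\alpha}_{r_2 s_2} \left(\frac{r_1 r_2}{d'}\right)$.

Applying additive characters we can also write

(11.9) $E(d) = \frac{1}{d} \sum_{a \,(\mathrm{mod}\, d)} \Big| \mathop{\sum\sum}_{(r, d) = 1} \alpha_{rs} \left(\frac{r}{d'}\right) e\Big(\frac{a \bar{r} s}{d}\Big) \Big|^2$.

For the variance $E^*(d)$ restricted to the primitive residue classes one can apply
multiplicative characters getting

(11.10) $E^*(d) = \frac{1}{\varphi(d)} \sum_{\chi \,(\mathrm{mod}\, d)} \Big| \sum_r \sum_s \alpha_{rs} \left(\frac{r}{d'}\right) \bar{\chi}(r) \chi(s) \Big|^2$.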

Our aim is to estimate the global variance

(11.11) $V(D) = \sum_{D < d \le 2D} \sum_{a \,(\mathrm{mod}\, d)} \Big| \mathop{\sum\sum}_{\bar{r}s \equiv a \,(\mathrm{mod}\, d)} \alpha_{rs} \left(\frac{r}{d'}\right) \Big|^2$.

If the variables $r, s$ are separable in the sense that the coefficients $\alpha_{rs}$ factor as

(11.12) $\alpha_{rs} = \beta_r \gamma_s$,

or they are a linear combination of such things, then an application of the
large sieve for character sums could be contemplated (cf. [BFI]). But nothing
like (11.12) holds in our case! The absence of such a factorization and the
presence of the Jacobi symbol necessitate new ideas to pursue the goal. Further
comments on this point are made at the end of Section 16.

In this section we establish a basic estimate for $V(D)$ which will be extended
in the next two sections and then summarized in Section 14. These
four sections can be more or less considered as an independent unit. For this
reason we shall feel free to use $V$ and later $W$ with meanings differing from
those in Section 10, to which we shall later return.

We express our estimates in terms of the $\ell_2$-norm of the vector $\alpha = (\alpha_{rs})$:

(11.13) $\|\alpha\|^2 = \sum_r \sum_s |\alpha_{rs}|^2$.

Proposition 11.1. Let $D, R, S \ge 1$. For any complex numbers $\alpha_{rs}$
supported in the box (11.6) we have

(11.14) $V(D) \ll \{ D^{-1/2} RS + (D\sqrt{RS} + RS^{3/4} + SR^{3/4})(RS)^{\varepsilon} \} \|\alpha\|^2$

with any $\varepsilon > 0$ and the implied constant depending only on $\varepsilon$.

Remarks. We have the trivial bound (use (11.8) and Lemma 11.2 below)

$E(d) \ll (d^{-1} RS + \sqrt{RS}) \|\alpha\|^2$.

Hence

$V(D) \ll (RS + D\sqrt{RS}) \|\alpha\|^2$.

Therefore the improvement in (11.14) appears in the first term. This first term
$D^{-1/2} RS$ is not weakened by $(RS)^{\varepsilon}$, so it gives a nontrivial bound for all but
very small moduli. One cannot do better because the moduli $d \sim D$ which
are squares contribute to $V(D)$ at least $D^{-1/2} RS$. However if we restricted $d$ to
squarefree numbers then our argument would give $D^{-1} RS$ in place of $D^{-1/2} RS$.
The second term $D\sqrt{RS}$ is fine for any $D$ slightly smaller than $\sqrt{RS}$. The last
two terms in (11.14) can probably be improved by refining our treatment but
there is no reason to do so. These two terms contribute less than the trivial
bound as long as $R$ and $S$ have the same order of magnitude in the logarithmic
scale, that is if $R \gg S^{\varepsilon}$ and $S \gg R^{\varepsilon}$.

We precede the proof of Proposition 11.1 with three easy lemmas.

Lemma 11.2. For $(a, b, d) = 1$ the number $\mathcal{N}_d(R, S)$ of solutions to
$ar \equiv bs \pmod{d}$ in positive integers $r \le R$ and $s \le S$ satisfies

(11.15) $\mathcal{N}_d(R, S) \le d^{-1} RS + \sqrt{RS}$.

Proof. Dividing the congruence by $\alpha = (a, d)$ and $\beta = (b, d)$, then counting
the solutions in two ways we infer the bound

$\min\Big\{ \frac{R}{\beta}\Big(\frac{S\beta}{d} + 1\Big),\ \frac{S}{\alpha}\Big(\frac{R\alpha}{d} + 1\Big) \Big\} \le d^{-1} RS + \min(R, S)$

which yields (11.15). □
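
The count in Lemma 11.2 is easy to test numerically. The following Python sketch is an illustration under the assumption that (11.15) is read as a plain inequality; the function names are hypothetical. It enumerates the pairs $(r, s)$ directly and compares with $d^{-1}RS + \sqrt{RS}$.

```python
from math import gcd, sqrt


def count_pairs(a: int, b: int, d: int, R: int, S: int) -> int:
    """Number of (r, s) with 1 <= r <= R, 1 <= s <= S and a*r = b*s (mod d)."""
    return sum(
        1
        for r in range(1, R + 1)
        for s in range(1, S + 1)
        if (a * r - b * s) % d == 0
    )


def bound_11_15(d: int, R: int, S: int) -> float:
    """Right side of (11.15): RS/d + sqrt(RS)."""
    return R * S / d + sqrt(R * S)


if __name__ == "__main__":
    # The sample boxes below are arbitrary; the hypothesis (a, b, d) = 1 is enforced.
    for d in range(1, 13):
        for a in range(1, d + 1):
            for b in range(1, d + 1):
                if gcd(gcd(a, b), d) != 1:
                    continue
                for R, S in [(5, 5), (3, 11), (12, 4)]:
                    assert count_pairs(a, b, d, R, S) <= bound_11_15(d, R, S) + 1e-9
```

The small tolerance only guards against floating-point rounding in the square root.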

Lemma 11.3. For $e \ge 1$ and $Q, R \ge 2$ we have

(11.16) $\sum_{q \le Q,\ q' \ne \square} \Big| \sum_{r \le R,\ (r, e) = 1} \left(\frac{r}{q'}\right) \Big|^2 \ll \min\{ \tau(e)^2 Q^2,\ QR + R^3 \} (\log QR)^2$.

Here, and hereafter, $q' \ne \square$ means that $q'$ is not the square of an integer.

Proof. By the Pólya-Vinogradov bound the inner sum is $O(\tau(e)\sqrt{q} \log q)$
giving the first estimate of (11.16). To get the second estimate we ignore the
condition $q' \ne \square$, then we square out and change the order of summation
obtaining

$\sum_{r_1} \sum_{r_2} \Big| \sum_{q} \left(\frac{r_1 r_2}{q'}\right) \Big|$.

The terms with $r_1 r_2 = \square$ contribute $O(QR \log R)$ by trivial estimation. The
remaining terms contribute $O(R^3 (\log Q)(\log R))$ by the Pólya-Vinogradov estimate.
Combining these contributions we derive (11.16). □


Lemma 11.4. 

(ni7) £E 

di,d2^D 


We have 
1 , 
[di,d2] 


E 


r^R 

(r,d 1 d 2 )=l 


d[d ' 2 


2 < {DR) £ 


for any e > 0, the implied constant depending on s. 
Proof. The condition $(r, d_1 d_2) = 1$ is redundant unless $d_1 d_2$ is even in
which case it simply means that $r$ is odd. Thus we can ignore this condition
by inclusion-exclusion. Hence the sum is bounded by

EEEcfcW-M E (^)l 2 

e^D r^R ' 12/ 

b[b' 2 ^n (r,e)=1 



q^Q r^R 
q'^O (r,e)=l 


for some Q < D 2 . By Lemma 11.3 this is <C ruin {Q. R + Q 1 i? 3 } ( DR) £ . 

3 

This bound yields (11.17), the worst Q being R?. □ 

Our treatment of $V(D)$ goes via the dual sum

(11.18) $W(D) = \sum_r \sum_s \Big| \sum_d \sum_{a \equiv \bar{r}s \,(\mathrm{mod}\, d)} \gamma_{ad} \left(\frac{r}{d'}\right) \Big|^2$.

Here $r, s, d, a$ run over the same ranges as in $V(D)$ and $\gamma_{ad}$ are any complex
numbers. By the duality principle familiar from the theory of the large sieve
(see, for example, page 32 of [Bo2]) the estimate (11.14) is equivalent to

(11.19) $W(D) \ll \{ D^{-1/2} RS + (D\sqrt{RS} + RS^{3/4} + SR^{3/4})(RS)^{\varepsilon} \} \|\gamma\|^2$.

Now we are going to prove (11.19). First we enlarge W(D) by attaching a 
smooth majorant f(s ) such that 

f(s) > 0 , f(s) > 1 if S < s < 25, 

m = 2S , f(t)<S(l + \t\S)~ 2 . 

Then we square out and change the order of summation getting 

W < X X X la dl iad 2 XX (dhf) 

d 1 d 2 a(mod q) rs=a(mod q) 12 

where q = [d\,d- 2 \ is the least common multiple of d\ and d 2 . Here and from 
now on, often without writing it, we assume that R < r ^ 2i?. 

The terms with d^d^ = □ contribute 

WD = XX XX la ldl ia 2 d 2 Y, X 

d' 1 d ' 2 =\2 ai(moddi) rs=ai(moddi) 

C12 (mod d2 ) rs=a2 (mod d2 ) 

«XX X l^aidil 2 XX • 

d'id' 2 =n\ ai(moddi) fs=a\ (mod d\) 
Hence by Lemma 11.2,

w D < (d-'rs + Vrs ) E-w E l7aidi| 

d\ ai(moddi) 


where v(d\) is the number of c ?2 ~ D such that d\ d' 2 = □. We have v(d\) <C VD 
(or even better v(d\) <C 1 if d\,do, were squarefree) giving 

(11.20) W D < (p-%RS+>jDRS ) hf 

where 

m 2 = E E i-r^i 2 . 

d a(mod d) 

Next we estimate the contribution W () of terms with d\ d 2 ^ □. By 
Cauchy’s inequality 

(11.21) \W < >\ 2 ^a( 1 )^2^2w(d l ,d 2 ) 

d[d' 2 ^D 

where 

= 2 S 2 I 'ladi OW 2 I ^ 11T11 i 

d\ d2 a(mod q) 


and W(di,d, 2 ) is a local variance to modulus q = [rfi,^]? namely 

w (*■*)- E I EE • 

a(modg) rs=a(mod q) 1 z 


By Poisson summation for the sum over s we have the Fourier expansion 


J2 /(») 

s=ar (mod q) 



Hence, by grouping terms according to the product hr we get 




a(mod q) 


y jCk e 

k 


ak 

q 


where 


C/c — 


EE/ 

hr=k 

(r,q )=1 


d'A 


Next, by the popular inequality |a: + y| 2 ^ 2|cc| 2 + 2|y| 2 and the orthogonality 
of additive characters 

2 2 __a 

W{di,d 2 ) < — |c 0 1 2 + - c fcl c fc2 . 

k\=k,2 (mod <?) 

&1&2/0 
The zero coefficient is the real character sum



with /(0) = 2 S and the other coefficients satisfy <C r(k)S (1 + \k\S/qR) 2 , 
by a trivial estimation. Hence we infer that 

W(d 1 ,d 2 )Cq~ 1 S 2 | ]T (-£-\ f +(R + S)R(qR) £ . 

(r,q)=1 ^ 1 2 ' 

Summing over d\, d 2 we obtain by Lemma 11.4 that 

(11.22) W° < (SR* +DR + D^RS') (ZLR) £ || 7|| 2 . 

Adding (11.20) and (11.22) we get 

(11.23) W(D) < {d~*RS + (SR* + RS% + DsfRS + DR) {RS) £ } hf . 

Here we have added the extra term RS 3 / 4 to gain some symmetry, and we 
replaced ( DR) £ by ( RS) £ because if D > RS the estimate (11.23) is trivial. 
To complete the proof of (11.19) it remains to remove the term DR in 

(11.23) . First we look at the sum W*(D) reduced by the condition (a, d!) = 1, 
in other words W*(D) is the sum W(D ) for vectors 7 a d with ( a,d'■) = 1. 
Clearly, if we switch r with s and also change to 7 ad(-^r) then W*(D) 
is not altered. Therefore due to this symmetry we may assume that R < S. 
Applying (11.23) we get 

(11.24) W*(D) < {d-^RS + (, SR 2 + RS~* + D^RS^ (i?S) £ } || 7|| 2 . 

Now we deduce the same bound for W(D). To this end, we transform W(D) 
as follows 

w (- 0 ) = X] ^ I iea ’ ed (^7) I 

r s e \s' (d',a)=1 

a=r s/e (mod d) 

«EEEE e °Mi EE -w(^)i 2 

r s e \s' f\s' (d',a)=1 

a=rs/e (mod d) 

«EE^)'T“E E I EE 

e f r~R s~S/ef (d',a )=1 

a=rsf( mod d) 

Here we have written 1 = e“e -Q with 0 < a < -7 which yields a factor e a f~ a 
by applying Cauchy’s inequality. This factor is needed for technical reasons, 
namely to make one term in the following bound free of the divisor function 
while losing only slightly in the remaining terms. Of course, such a refinement
is not crucial but it will simplify some forthcoming work. Now applying (11.24) 
we get 

W(D) < 

e f (d,a )=1 

{{Df)~^RS + (dVrS + RS~* + SR 2 ) (RS/eff^ . 

We choose a = so the series for / converges and the function r(e)e“ _ 2 is 
bounded. By the above estimates, and since 

= IMI 2 ’ 

e (d,a)=1 

we obtain (11.19) for W{D ) as claimed. Finally, by the duality principle, 
(11.19) implies (11.14) for V(D). □ 


12. Flipping moduli 


The bound (11.14) is nontrivial in the range

$(\log 2RS)^A < D < (RS)^{1/2 - \varepsilon}$.

In this section we leapfrog by reflection over the middle to cover the range

$(RS)^{1/2 + \varepsilon} < D < RS (\log 2RS)^{-A}$.

For technical convenience we assume that our vectors $\alpha = (\alpha_{rs})$ satisfy

(12.1) $(r, 2s) = 1$.

Furthermore by splitting into four residue classes we can assume without loss
of generality that $r$ is fixed modulo eight. We have by the reciprocity law

$\left(\frac{r}{d'}\right) = \pm \left(\frac{d}{r}\right)$

where $\pm$ depends on $d$ and on $r \pmod 8$ but not on $r$ in any other fashion.
Therefore $V(D)$ for our vectors can be written as

(12.2) $V(D) = \sum_{D < d \le 2D} \sum_{a \,(\mathrm{mod}\, d)} \Big| \mathop{\sum\sum}_{\bar{r}s \equiv a \,(\mathrm{mod}\, d)} \alpha_{rs} \left(\frac{d}{r}\right) \Big|^2$.

Proposition 12.1. Let $D, R, S \ge 1$. For any complex numbers $\alpha_{rs}$
with $(r, 2s) = 1$ supported in the box (11.6) we have

(12.3) $V(D) \ll \mathcal{C}(D, R, S) \sum_r \sum_s \tau(r) |\alpha_{rs}|^2$

where $\mathcal{C}(D, R, S)$ satisfies the bound

(12.4) $\mathcal{C}(D, R, S) \ll D + D^{1/3} (RS)^{2/3} (\log 2RS)^4 + \big[ D^{-1} (RS)^{3/2} + RS^{3/4} + SR^{3/4} \big] (RS)^{\varepsilon}$,

with any $\varepsilon > 0$ and the implied constant depending only on $\varepsilon$.

Proposition 12.1 will be derived from Proposition 11.1 by way of interpreting
the congruence $r_1 s_2 \equiv r_2 s_1 \pmod{d}$ as $r_1 s_2 \equiv r_2 s_1 \pmod{k}$ where $k$
is the complementary divisor. The idea is reminiscent of that used by C.
Hooley [Ho] in a similar context (the Barban-Davenport-Halberstam theorem),
however the presence of the Jacobi symbol makes our case quite subtle since the
reciprocity law is essentially employed in the process of flipping moduli. There
will be some complications related to various co-primality conditions. In order
to distribute the burden of complications evenly throughout the proof we are
going to prepare the following variation of Proposition 11.1.
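
The elementary fact behind flipping moduli is that a divisor $d$ of the nonzero determinant $r_1 s_2 - r_2 s_1$ comes paired with a complementary divisor $k$, and the congruence holds modulo both. The Python sketch below only illustrates this pairing on made-up sample values; the function name is hypothetical, and the reciprocity step for the Jacobi symbol is not modelled here.

```python
def complementary_divisor(delta: int, d: int) -> int:
    """Return k with d * k = |delta|; requires delta != 0 and d | delta."""
    n = abs(delta)
    if n == 0 or n % d != 0:
        raise ValueError("d must divide a nonzero delta")
    return n // d


if __name__ == "__main__":
    r1, s1, r2, s2 = 3, 10, 7, 2      # sample values satisfying (r, 2s) = 1
    delta = r1 * s2 - r2 * s1         # equals -64
    for d in (1, 2, 4, 8, 16, 32, 64):
        k = complementary_divisor(delta, d)
        # The same determinant vanishes modulo d and modulo k.
        assert (r1 * s2 - r2 * s1) % d == 0
        assert (r1 * s2 - r2 * s1) % k == 0
        print(d, k)
```

Small $d$ corresponds to large $k$ and vice versa, which is why a bound for small moduli can be transferred to large ones.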

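Proposition 11.1*. Let $m, D, R, S \ge 1$. For any complex numbers
$\alpha_{rs}$ and $\beta_{rs}$ supported in the box (11.6) we have

(12.5) $\sum_{D < d \le 2D} \Big| \mathop{\sum\sum\sum\sum}_{\substack{r_1 s_2 \equiv r_2 s_1 \,(\mathrm{mod}\, dm) \\ (r_1, r_2) = 1}} \alpha_{r_1 s_1} \beta_{r_2 s_2} \left(\frac{d}{r_1 r_2}\right) \Big| \le \mathcal{M}(mD, R, S)\, \|\alpha\| \|\beta\|$

where $\mathcal{M}(D, R, S)$ satisfies the bound

$\mathcal{M}(D, R, S) \ll D^{-1/2} RS \log 2R + (D\sqrt{RS} + RS^{3/4} + SR^{3/4})(RS)^{\varepsilon}$

with any $\varepsilon > 0$ and the implied constant depending only on $\varepsilon$.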


Proof. As compared to the local variance (11.8) the bilinear form in (12.5) 
has a few extra features. To handle the extra modulus m we write £ = dm 
and note that ( r\r 2 ,m ) = 1, so that we can separate = (yypy) (nrj)’ 


and then transfer 


to the coefficients a. 


ri«i Hr2S2 • 


The new modulus £ 


runs over the segment mD < £ < 2mD and satisfies £ = 0 (rnodm) but we 
shall ignore the latter property by positivity. Then we remove the condition 
(?’i, r* 2 ) = 1 by Mobius inversion. After these transformations we see that the 
sum (12.5) is bounded by 

EE E I EE <v.(v)ll EE 

p <~mDa(mod<) rsEa(modl) ' rsEafmod <) ' 2 


If we denote, for a given $\rho$, the above inner multiple sum by $L(\alpha, \beta)$ then, on applying
Cauchy's inequality, $L^2(\alpha, \beta) \le L(\alpha, \alpha) L(\beta, \beta)$. Now, Proposition 11.1
is applicable to $L(\alpha, \alpha)$ giving


L(a, a) <C 


f RS 
l pVmD 


+ 


mDVRS + RSi+SR* 


(w EE 


a 


prs | 


Similarly $L(\beta, \beta)$ satisfies the same estimate with $\alpha_{\rho rs}$ replaced by $\beta_{\rho rs}$. From
these two estimates one arrives at Proposition 11.1* by summing over $\rho$. □


Now we proceed to the proof of Proposition 12.1. We begin by attaching 
to V(D) a smooth majorant f(y) supported on -y D < y ^ 3D, then we square 
out getting 

(12.6) V(D) < E /w EEEE ^■riSi^V2S2 ^ ^ • 

d riS2=r2Si(modd) 

Next we remove the terms near the diagonal. To do this smoothly we use the
function $g(x)$ whose graph is

[Figure: graph of $g$, with the points $S/HR$, $2S/HR$, $2S/R$, $3S/R$ marked on the horizontal axis]

with $H$ to be chosen later subject to $1 \le H \le RSD^{-1}$. Notice that for $r_1, s_1$
and $r_2, s_2$ in the box (11.6) we have

■ si s 2 I 2 S 

r\ r 2 R 

so the factor g ^ ^ may be inserted into (12.6) without alteration, 

except for the points ri, si and r 2 , s 2 with 

, si s 2 | 2 S 

'i r 2 HR 

The contribution of these exceptional points is estimated trivially by 

V 0 (f,g) < -D||a|| 2 + EE' y ( r ’ s )l“ rs l 2 

r s 

where the first term comes from the points exactly on the diagonal ris 2 = r 2 si 
(note that this equation implies r\ = r 2 and si = s 2 by virtue of (12.1)), 
and the second from the rest. Here v(r, s) is the number of integers k with 
1 < k < 8 RS(DH)^ 1 and rationals s\/r\ such that 

I —-I < —— and ?’is = rsi (mod k ) . 

1 n r HR 
Each $k$ is obtained by flipping $d$ to the complementary divisor of $|r_1 s - r s_1|$.
Given $k$ one shows, as in the proof of Lemma 11.2, that the number of rationals
$s_1/r_1$ satisfying the above conditions is $\nu_k(r, s) \ll RS(Hk)^{-1} + \sqrt{RS}$. Summing
over $k$ we obtain


u(r, s ) <C 


RS 

If 


, 2 RS\ RS r— 

log Dw) + m' /irs ■ 


This yields 


(12.7) 


V 0 (f,g) < [D + H~ l RS\og2RS + H~ l D~ l R 2 S 


O' 


The remaining points contribute 

vu,g) = '£m EEEE 

d riS2=V2Si(modd) 


&riSl &V2S2 


r 17*2 


, , Si 52 

5- 

ri r 2 


We reduce the variables $r_1, r_2$ by the common divisor $c = (r_1, r_2)$ and remove
the resulting condition $(c, d) = 1$ by Möbius inversion getting

(12.8) $V(f, g) = \sum_c \sum_{m \mid c} \mu(m) V_{cm}(f, g)$

where $V_{cm}(f, g)$ is the sum

(12.9) ]T f(dm) J2J2J2J2 acr lsl acr 2S2 (J^) 9 Q 

d ns 2 =r 2 si(dm) 

(ri,r 2 )=l 


Si 

n 



For every $c$ and $m \mid c$ we estimate $V_{cm}(f, g)$ separately. Here we flip $d$ to the
complementary divisor of $|r_1 s_2 - r_2 s_1| m^{-1}$. We write $|r_1 s_2 - r_2 s_1| = dmq$ and
interpret each and every term involving $d$ in terms of $q$. We have


/ (dm) = f ^ 

and using the reciprocity law we get (recall that r\, r 2 are co-prime, odd, and 
congruent modulo eight) 


£i _ £2 | fir2 
ri r 2 q 


dm \ = ffi Wf 2 \ ( q \ 
rif2) \nj \f2) \r 1 r 2 J 
This transforms V cm (f,g) into the sum 


( 12 . 10 ) 


EEEEE^a... 

q r\s 2 =r 2 s\(mq) 

(ri,r 2 )=l 


-) B (- (- 

rif 2 j \c \fi 


s£\ crir 2 \ 
f 2 ) q ) 


where 

( 12 . 11 ) 


B{x,y) = f(\x\y)g(\x\) 
and



Next we separate the variables q,ri,s\,r 2 ,S 2 involved in B(x,y) for the 
x, y relevant to (12.10). Before doing this we observe that q runs over the 
segment 


( 12 . 12 ) 


RS 

cDH 


<q< 


24 RS 
cD 


which fact follows by examining the support of g. Having recorded this in¬ 
formation we represent B(x,y ) as the Fourier-Mellin transform in x and y 
respectively, 


(12.13) 


f(\x\y)g(\x\) 


h(u,t)e(ux)y lt dudt . 


Here we have by the inverse transform 
1 


h(u,t ) = 


2vr _ 

i r°° 


IT 


'0 


f{\x\y)g(\x\)e(-ux)y lt dxdy 

poo 

x' lt g(x) cos(2irux)dx / f{y)y i ~ lt dy. 


Integrating the Mellin transform by parts four times we get 

poo 

/ f{y)y~ l ~ u dy < (t 2 + I)” 2 , 

Jo 

and integrating the Fourier transform by parts up to two times we get 

S 1 HR' 


x u g(x) cos(2irux)dx <C (i + 1) min 


log 2 H . 


r ’ \ u y u 2 s _ 

From these estimates it follows that the Li-norm of h(u,t) satisfies 


(12.14) 


\h(u,t)\dudt <C (log2 H) 2 . 


Applying (12.13) to (12.10) we infer that 
(12.15) V cm (f,g) « (log2Ff) 2 ^ 

<1 riS2=r2Si(mq) 

(ri,r 2 )=l 



where 


(12.16) 

for some real u and t 
remains unchanged). 


At 


{US 


lrs = r e — - a crs 

\cr / V r / 

(the ~ denotes complex conjugation except for r lt which 
At last we are ready to apply Proposition 11.1*. This gives, by (12.12),
(12.15) and (12.16), 

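$V_{cm}(f, g) \ll \Big\{ (cm)^{-1/2} (DHRS)^{1/2} (\log 2RS)^3 + \big[ c^{-3/2} m D^{-1} (RS)^{3/2} + RS^{3/4} + SR^{3/4} \big] (RS)^{\varepsilon} \Big\} \sum_r \sum_s |\alpha_{crs}|^2$.

Summing over $m$ and $c$ as in (12.8) we obtain

(12.17) $V(f, g) \ll \mathcal{H}(D, H, R, S) \sum_r \sum_s \tau(r) |\alpha_{rs}|^2$

where $\mathcal{H}(D, H, R, S)$ satisfies the bound

$\mathcal{H}(D, H, R, S) \ll (DHRS)^{1/2} (\log 2RS)^4 + \big[ D^{-1} (RS)^{3/2} + RS^{3/4} + SR^{3/4} \big] (RS)^{\varepsilon}$.

Adding (12.17) to (12.7) and choosing $H = (D^{-1} RS)^{1/3}$ we complete the proof
of (12.3). □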

13. Enlarging moduli 

After the results of the last two sections we need a nontrivial bound for
$V(D)$ in the middle range

$(RS)^{1/2 - \varepsilon} < D < (RS)^{1/2 + \varepsilon}$.

We deal with these moduli by establishing an inequality between $V(D)$ and
$V(D^+)$ for $D^+ > D$, thereby allowing us to appeal to the result for larger
moduli given in the previous section. We accomplish this by considering, for
each given $d$, the special moduli $dp^2$ where $p$ runs through a set of primes.
Since the Jacobi symbol for the enlarged modulus $dp^2$ is essentially the same
as that for $d$ this gives us a multiple counting of the original sum. Without
this multiplicity we would gain nothing and for this reason the method fails for
general characters. Different arguments but of similar spirit appear already in
the paper [BD] of Bombieri and Davenport.
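
The statement that the symbol for $dp^2$ is essentially the same as that for $d$ rests on multiplicativity in the upper argument: for odd $r$ with $p \nmid r$ the square factor $p^2$ contributes $+1$. The Python sketch below checks this on a small range. It is an illustration only; it assumes the symbol is taken in the form $(d/r)$ with $r$ odd, as after the reciprocity step of Section 12, and the helper name `jacobi` is hypothetical.

```python
def jacobi(a: int, n: int) -> int:
    """Jacobi symbol (a/n) for a positive odd integer n."""
    if n <= 0 or n % 2 == 0:
        raise ValueError("n must be a positive odd integer")
    a %= n
    result = 1
    while a != 0:
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0


if __name__ == "__main__":
    for p in (3, 5, 7):
        for d in range(1, 21):
            for r in range(1, 60, 2):          # odd r only
                if r % p == 0:
                    # When p | r the enlarged symbol vanishes.
                    assert jacobi(d * p * p, r) == 0
                else:
                    # When p does not divide r, enlarging d to d*p^2 changes nothing.
                    assert jacobi(d * p * p, r) == jacobi(d, r)
```

The vanishing for $p \mid r$ is the reason the sum is first split according to whether $p$ divides $r$.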

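Let $D, R, S \ge 1$ and $\alpha_{rs}$ be any complex numbers for $(r, 2s) = 1$ supported
in the box (11.6). For any prime $p$ we write

$\mathop{\sum\sum}_{\bar{r}s \equiv a \,(\mathrm{mod}\, d)} \alpha_{rs} \left(\frac{d}{r}\right) = \mathop{\sum\sum}_{\substack{\bar{r}s \equiv a \,(\mathrm{mod}\, d) \\ p \nmid r}} \alpha_{rs} \left(\frac{d}{r}\right) + \mathop{\sum\sum}_{\substack{\bar{r}s \equiv a \,(\mathrm{mod}\, d) \\ p \mid r}} \alpha_{rs} \left(\frac{d}{r}\right)$.

Hence we infer

(13.1) $V(D) \le 2 \sum_{D < d \le 2D} \big( E_p(d) + E'_p(d) \big)$

where

$E_p(d) = \sum_{a \,(\mathrm{mod}\, d)} \Big| \mathop{\sum\sum}_{\substack{\bar{r}s \equiv a \,(\mathrm{mod}\, d) \\ p \nmid r}} \alpha_{rs} \left(\frac{d}{r}\right) \Big|^2$

and $E'_p(d)$ is given by the same formula but with the condition $p \mid r$ in place of
$p \nmid r$. We estimate the latter trivially as follows

$\sum_d E'_p(d) \le \sum_d \mathop{\sum\sum\sum\sum}_{r_1 s_2 \equiv r_2 s_1 \,(\mathrm{mod}\, d)} |\alpha_{p r_1 s_1} \alpha_{p r_2 s_2}| \ll \{ D + p^{-1} (RS)^{1+\varepsilon} \} \sum_r \sum_s |\alpha_{prs}|^2$.

Multiplying (13.1) through by $p^{-1} \log p$ and summing over $P < p \le 2P$ we
obtain

(13.2) $V(D) \ll \sum_{P < p \le 2P} \frac{\log p}{p} \sum_{D < d \le 2D} E_p(d) + (DP^{-1} + P^{-2} RS)(RS)^{\varepsilon} \|\alpha\|^2$.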

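In general we have the following inequality (monotonicity of the local variance)

$A(d) = \sum_{a \,(\mathrm{mod}\, d)} \Big| \sum_{n \equiv a \,(\mathrm{mod}\, d)} a_n \Big|^2 = \frac{1}{d} \sum_{b \,(\mathrm{mod}\, d)} \Big| \sum_n a_n e\Big(\frac{bn}{d}\Big) \Big|^2 \le \frac{1}{d} \sum_{b \,(\mathrm{mod}\, dm)} \Big| \sum_n a_n e\Big(\frac{bn}{dm}\Big) \Big|^2 = m A(dm)$.

We apply this with $m = p^2$ to $E_p(d)$ getting

(13.3) $E_p(d) \le p^2 E(dp^2)$

since $\left(\frac{d}{r}\right) = \left(\frac{dp^2}{r}\right)$ for $p \nmid r$. Inserting (13.3) into (13.2) we obtain the following

Principle of Enlarging Moduli. For every $D, P, R, S \ge 1$ we have

(13.4) $V(D) \ll V(D^+) P \log DP + (DP^{-1} + P^{-2} RS)(RS)^{\varepsilon} \|\alpha\|^2$

for some $D^+$ with $DP^2 \le D^+ \le 4DP^2$ and where the implied constant depends
only on $\varepsilon$.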
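
The monotonicity of the local variance used for (13.3) can be observed directly: each residue class modulo $d$ splits into $m$ classes modulo $dm$, and Cauchy's inequality then gives $A(d) \le m A(dm)$. The Python sketch below checks this on an arbitrary test sequence. It is an illustration only; the function name `local_variance` and the sample data are hypothetical.

```python
def local_variance(seq: dict, d: int) -> float:
    """Sum over classes a mod d of |sum of a_n with n = a (mod d)|^2.

    `seq` maps each index n to the coefficient a_n.
    """
    sums = [0j] * d
    for n, a_n in seq.items():
        sums[n % d] += a_n
    return sum(abs(z) ** 2 for z in sums)


if __name__ == "__main__":
    # An arbitrary complex test sequence, made up for this check.
    seq = {n: complex((7 * n) % 5 - 2, n % 3 - 1) for n in range(1, 40)}
    for d in range(1, 7):
        for m in range(1, 6):
            # Monotonicity: A(d) <= m * A(dm), up to rounding.
            assert local_variance(seq, d) <= m * local_variance(seq, d * m) + 1e-9
```

Equality can occur, for instance when the sequence is constant on each class modulo $d$ and spread evenly over the finer classes.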


Actually $D^+$ may be taken to be $2^j DP^2$ for one of $j = 0, 1, 2$. Note that
in the recurrent term on the right side we have enlarged the range of moduli by
a factor $P^2$ while losing only $P \log DP$ in the bound. Combining Proposition
12.1 with the above principle we infer
Proposition 13.1. Let $D, R, S \ge 1$. For any complex numbers $\alpha_{rs}$
with $(r, 2s) = 1$ supported in the box (11.6) we have

(13.5) $V(D) \ll \{ D (R+S)^{1/4} (RS)^{1/4} + D^{-1/2} (R+S)^{1/8} (RS)^{9/8} \} (RS)^{\varepsilon} \|\alpha\|^2$.

Proof. We can assume that

(13.6) $(R+S)^{1/4} (RS)^{1/4} \le D \le (R+S)^{-1/4} (RS)^{3/4}$

or else the bound (13.5) is trivial. Applying (12.3) to $V(D^+)$ in (13.4) we
obtain

$V(D) \ll \{ D^{1/3} P^{5/3} (RS)^{2/3} + (DP)^{-1} (RS)^{3/2} + P (R+S)^{1/4} (RS)^{3/4} + DP^3 + P^{-2} RS \} (PRS)^{\varepsilon} \|\alpha\|^2$.

We choose $P = D^{-1/2} (R+S)^{-1/8} (RS)^{3/8}$ getting (13.5). □


14. Jacobi-twisted sums: Conclusion 


We combine the results of Sections 11, 12, 13 to formulate a bound for 
V(D) which is nontrivial throughout the range 

(log 2 RS) a < D < RS (log 2 RS)~ a 
just as in the Barban-Davenport-Halberstam Theorem. 

Proposition 14.1. Let $D, R, S \ge 1$. For any complex numbers $a_{rs}$ with (r, 2s) = 1 supported in the box $R < r < 2R$, $S < s < 2S$, we have

$$\sum_{D<d\le 2D}\ \sum_{a\,(\mathrm{mod}\,d)}\Big|\mathop{\sum\sum}_{rs\equiv a\,(\mathrm{mod}\,d)}a_{rs}\Big(\frac{r}{d}\Big)\Big|^2 \le M(D,R,S)\sum_r\sum_s\tau(r)|a_{rs}|^2$$

where $M(D, R, S)$ satisfies the bound

$$M(D,R,S) \ll D + D^{-\frac12}RS + D^{\frac13}(RS)^{\frac23}(\log 2RS)^4 + (R+S)^{\frac1{12}}(RS)^{\frac{11}{12}+\varepsilon}$$

for any $\varepsilon > 0$, the implied constant depending only on $\varepsilon$. Here $(\frac{r}{d})$ is the Jacobi symbol if d is odd and is extended for d even by (8.15).

Proof. This follows by application of Proposition 11.1 if $D \le D_1$, Proposition 13.1 if $D_1 < D < D_2$, and Proposition 12.1 if $D \ge D_2$, where

D 1 = (R + S)T2(RS)^ and D 2 = (R + S)~*(RS) ! b . 
15. Estimation of $V(\beta)$


Now we are ready to estimate $V(\beta)$ which is the partial sum of (10.13) over the moduli d in the middle range as defined in (10.16). We write $z_1 = r_1 + is_1$ and $z_2 = r_2 + is_2$ so $\Delta(z_1, z_2) = r_1s_2 - r_2s_1$ and the congruence $\Delta(z_1, z_2) \equiv 0 \pmod{4d}$ reads as


(15.1)   $r_1s_2 \equiv r_2s_1 \pmod{4d}$.

For any $p \mid \Delta$ we have

$z_2/z_1 \equiv r_2/r_1$ if $p \nmid r_1$,
$z_2/z_1 \equiv s_2/s_1$ if $p \nmid s_1$.


Since $(r_1, s_1) = 1$ we always have one or both cases. Moreover we have $(d, r_1) = (d, r_2) = e$, say. Dividing (15.1) by e and changing notation to remove the factor e from $d, r_1, r_2$ we infer that $V(\beta)$ is given by
(15.2) 


! ^ 

( zi , z 2 )=l 


P. 


Z2 


EE 

ed>X 

r\S2=T2Si(4d) 


«2) 


(p(ed) /sis 2 \ (r\r 2 


ed 


d 


log 2 


2422 , 

A 1 


where now $z_1 = er_1 + is_1$, $z_2 = er_2 + is_2$.

Our next goal is to separate the variables $z_1, z_2$. The condition $(z_1, z_2) = 1$ can be removed precisely by the Möbius formula, however we take advantage of the restriction (5.8) which allows us to ignore this condition at the admissible cost $O(P^{-1}N^2)$. To get it at this cost use Lemma 2.2 with k = 4 to reduce d before estimating trivially. Here one naturally loses a few logarithms coming from the structure of the arithmetic functions involved but these are compensated by the saving of a large number of logarithms due to the size of the box so the bound is actually smaller than stated. This remark, which we shall not repeat, will later apply to a few 'trivial' estimates of a similar nature.

To separate the variables $z_1, z_2$ constrained by $f(|\Delta|/ed)$ we use the same technique as in Section 12. We employ the function g whose graph is

[Figure: graph of the function g]

Put $B(x,y) = f(|x|y)g(|x|)$. The cutting factor $f(|\Delta|/ed)$ can be replaced in (15.2) by $B(x,y)$ at the special points

(15.3)   $x = \dfrac{s_1}{r_1} - \dfrac{s_2}{r_2}$ and $y = \dfrac{|r_1r_2|}{d}$.
Now, using the Fourier-Mellin transform of $B(x,y)$ (see (12.13)), we separate the variables at the cost of the $L^1$-norm of the transform which is bounded by $O((\log N)^2)$; see (12.14).

To separate the variables in the logarithmic factor

(15.4)   $\log 2|z_1z_2/\Delta| = -\log|\tfrac12\sin(\alpha_2 - \alpha_1)|$,

see (7.2), we use the Fourier series expansion

(15.5)   $-h(\alpha)\log|\tfrac12\sin\alpha| = \sum_{\ell}c_\ell e^{i\ell\alpha}$.

Here we have attached a "mollifier" function $h(\alpha)$ for the purpose of accelerating the convergence. We assume that $h(\alpha)$ is even, periodic of period $2\pi$ and vanishing at $\alpha = 0$. There are many good choices. For our purpose it suffices to take

(15.6) h(a) = min j || — 1| N, lj . 

We have

(15.7)   $\sum_{\ell}|c_\ell| \ll (\log N)^2$.

Note that for $\alpha = \alpha_2 - \alpha_1$ the mollifier $h(\alpha)$ does not alter the value (15.4) because $\pi N^{-1} < |\alpha| < \pi$. Inserting the Fourier series (15.5) in (15.2) we achieve the separation of variables at the cost of $O((\log N)^2)$.

After the separation of variables we get by (15.2) 

V(P) « (log IV) 4 ]T]T E I EE ZersPer+is Q f +P~'N 2 

X<ed<Y a(mod d) ar=s(Ad ) 

where $Y = NX^{-1}$ (this limit was redundant before the separation of variables) and $\xi_{ers}$ are complex numbers with $|\xi_{ers}| = 1$ (these are character values coming out of the separation process). Given e we apply Proposition 14.1 to the sums over d, a, r, s with D, R, S such that $X < eD < Y$, $eR < \sqrt{2N}$ and $S < \sqrt{2N}$, showing that $V(\beta)$ is bounded by

(log N) 8 {x-^N + Y^N% + Tz{r)\(3 r+is \ 2 + P l N 2 . 

r s 

Here we estimate $|\beta_{r+is}|$ by $\tau(r^2 + s^2)$ but retain the range of summation in the box (5.14). Then we relax $\tau(r^2 + s^2)$ by applying Lemma 2.2 so that the summation over s can be executed with sufficient accuracy. Finally summing over r we arrive at

££ r 3 (r)|/3 r+ i S | 2 < 6 2 N(logN) 199e . 


Hence we conclude 
Proposition 15.1. For 1 < X < N& we have 
(15.8) V(P) < (P -1 + X~^ N 2 . 


16. Estimation of $U(\beta)$

In this section we estimate $U(\beta)$ using classical methods of prime number theory in the Gaussian domain. We write

(16.1) U(0) = 2 d~\{d)U d {(3) 

d^X 

where 

(16.2) UM= VV lo S 21^|- 

A(2i,22)=0(mod 4 d) 

Here the moduli are small and the source of cancellation will be the sign change of the Möbius function which is part of $\beta_z$. Therefore we can afford to estimate $U_d(\beta)$ individually for every $d \le X$. In view of the multiplicative structure of $\beta_z$ it is natural to employ Hecke characters.

First note that the condition $\Delta(z_1, z_2) \equiv 0 \pmod{4d}$ is equivalent to

(16.3)   $z_1 \equiv \omega z_2 \pmod{4d}$

for some rational residue class $\omega \pmod{4d}$. Thus we can write

(16.4) UM= E Q EE /(^)AA log2|if |. 

o>(mod4d) ( 21 , 22 )=! 

z\ =luz 2 (mod 4 d) 

Now we remove the condition $(z_1, z_2) = 1$ which costs us $O(d^{-1}P^{-1}N^2)$ by a trivial estimation using (5.8). Moreover by a trivial estimation we can remove the terms of (16.4) near the diagonal, say those with $|\alpha_2 - \alpha_1| < 2\pi H^{-1}$, at the cost of $O(d^{-1}H^{-1}N^2)$. This can be done smoothly by means of a function $h(\alpha)$ essentially as in (15.6); here specifically

(16.5) h(a) = min j || — \\H, lj . 

We obtain 

(16.6) U d {/3 ) = 0 ) YU2 PziPz2 U ( a i ~ “ 2 ) 

a; (mod Ad) z± =UZ 2 (mod Ad) 

+ 0 (d- 1 ^- 1 + H^)N 2 ) 
where (see (7.2))

(16.7)   $u(\alpha) = -h(\alpha)\log|\tfrac12\sin\alpha|$.

Note that the factor $f(|\Delta|/d)$ is not needed in (16.6). Next we expand $u(\alpha)$ into its Fourier series

$u(\alpha) = \sum_k \hat u(k)e^{ik\alpha}$

which converges quite rapidly. Indeed, we derive by (16.5) that

(16.8)   $\hat u(k) \ll (1 + k^2H^{-2})^{-1}\log H$.

Hence we can truncate the Fourier series with a small error term, namely

(16.9)   $u(\alpha) = \sum_{|k|\le K}\hat u(k)e^{ik\alpha} + O(K^{-1}H^2\log H)$.
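For the reader's convenience we sketch why the error term in (16.9) follows from (16.8). This is a routine tail estimate under the assumption $K \ge 1$; the constants are not optimized.

```latex
\sum_{|k|>K}|\hat u(k)|
  \ll \log H \sum_{|k|>K}\Big(1+\frac{k^2}{H^2}\Big)^{-1}
  \le 2H^2\log H \sum_{k>K}\frac{1}{k^2}
  \le \frac{2H^2\log H}{K}.
```

Since $|e^{ik\alpha}| = 1$, the discarded terms contribute at most this amount uniformly in $\alpha$.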

Inserting this in (16.4) we get 

(16.10) u d (p) = y «(*) 0 J2J2 

u;(mod4d) zi=ujz 2 (mod Ad) 

+ 0(d- 1 (p- 1 + H~ 1 + K- 1 H 2 )N 2 ) . 

Now we detect the congruence (16.3) by multiplicative characters of the group $(\mathbb{Z}[i]/4d\mathbb{Z}[i])^*$ getting

( 16 - u > E G) EE&A(tS)* = Gdj E^WI^OI 2 

CJ Z\=UZ2 ' ' X 

where $\Phi(d)$ is the order of the group which is given by

(16.12)   $\Phi(d) = 4d\,\varphi(4d)\prod_{p\mid d}\Big(1 - \frac{\chi_4(p)}{p}\Big)$,

and where

(16.13)   $J(\chi) = \sum_{\omega\,(\mathrm{mod}\,4d)}\chi(\omega)\Big(\frac{\omega}{d}\Big)$.

Note that the sum $J(\chi)$ is incomplete because $\omega$ runs over the subgroup of rational classes modulo 4d whereas $\chi$ is a character on the group of all classes modulo 4d in $\mathbb{Z}[i]$. For every such $\chi$ in (16.11) we have the character sum

(16.14)   $S_\chi^k(\beta) = \sum_z \beta_z\chi(z)(z/|z|)^k$.
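The order of the group $(\mathbb{Z}[i]/4d\mathbb{Z}[i])^*$ can be confirmed by brute force for small d. The sketch below compares a direct count with the closed form $4d\,\varphi(4d)\prod_{p\mid d}(1-\chi_4(p)/p)$, which is our reading of (16.12); the helper names are hypothetical and only small moduli are tried.

```python
from math import gcd

def euler_phi(n):
    """Euler's totient, by direct count (fine for small n)."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

def chi4(p):
    """The non-trivial Dirichlet character modulo 4."""
    return {1: 1, 3: -1}.get(p % 4, 0)

def prime_factors(n):
    """Set of distinct prime factors of n."""
    factors, p = set(), 2
    while p * p <= n:
        while n % p == 0:
            factors.add(p)
            n //= p
        p += 1
    if n > 1:
        factors.add(n)
    return factors

def unit_count(m):
    """Order of (Z[i]/mZ[i])^*: r + is is a unit iff its norm is coprime to m."""
    return sum(1 for r in range(m) for s in range(m) if gcd(r * r + s * s, m) == 1)

def phi_formula(d):
    """Assumed closed form 4d * phi(4d) * prod_{p | d} (1 - chi4(p)/p), in exact integers."""
    value = 4 * d * euler_phi(4 * d)
    for p in prime_factors(d):
        value = value * (p - chi4(p)) // p
    return value

# Pairs (brute-force count, closed form) for d = 1, ..., 8.
agreement = {d: (unit_count(4 * d), phi_formula(d)) for d in range(1, 9)}
```

For instance d = 3 gives modulus 12, where the 8 units modulo 4 and the 8 units of the field $\mathbb{Z}[i]/3$ combine to 64.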

Inserting (16.11) in (16.10) and the latter in (16.1) we arrive at the formula 

(16.15) i m = y. *(*) E Jfzl E 

d^X %(mod4d) 

+ 0 ((P- 1 + FT 1 + K^H^N 2 log N) . 
It remains to estimate the character sums $S_\chi^k(\beta)$. To this end we shall employ the theory of Hecke L-functions. The foundation of this theory was described by Hecke in the setting of a general number field but we shall take advantage of simplifications that come with the Gaussian field. Note that for $m \in \mathbb{Z}$ and $\chi$ a multiplicative character on the residue classes modulo 4d in $\mathbb{Z}[i]$ the function

(16.16)   $\psi(z) = \chi(z)(z/|z|)^m$

defines a character on odd ideals of $\mathbb{Z}[i]$ by setting

$\psi(\mathfrak{a}) = \psi(z)$

where z is the unique generator of $\mathfrak{a}$ which is primary.

We say that the Gaussian integer z is odd if it is co-prime to 1 + i. Recall from (5.3) that we say z is primary if $z \equiv 1 \pmod{2(1+i)}$ (see also (5.4) and (5.5)). Every odd z is associate to exactly one primary integer; we denote this by z. The only primary unit is z = 1. The product of primary numbers is primary and every primary number factors uniquely (up to permutation) as a product of primary numbers which are Gaussian primes.
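The two elementary facts just recalled can be checked mechanically. In the sketch below a Gaussian integer $z = r + is$ is a pair `(r, s)`; the condition $z \equiv 1 \pmod{2(1+i)}$ is unwound to "s even and $r \equiv 1 + s \pmod 4$". The function names and the search box are assumptions made for illustration.

```python
def is_odd(r, s):
    """z = r + is is coprime to 1 + i exactly when its norm r^2 + s^2 is odd."""
    return (r * r + s * s) % 2 == 1

def is_primary(r, s):
    """z = 1 (mod 2(1 + i)), i.e. s is even and r = 1 + s (mod 4)."""
    return s % 2 == 0 and (r - 1 - s) % 4 == 0

def associates(r, s):
    """The four unit multiples z, iz, -z, -iz."""
    return [(r, s), (-s, r), (-r, -s), (s, -r)]

def multiply(z, w):
    """Product of two Gaussian integers given as pairs."""
    (a, b), (c, d) = z, w
    return (a * c - b * d, a * d + b * c)

box = [(r, s) for r in range(-9, 10) for s in range(-9, 10)]
odd_numbers = [z for z in box if is_odd(*z)]
primary_numbers = [z for z in box if is_primary(*z)]

# Every odd z has exactly one primary associate.
unique_associate = all(
    sum(is_primary(*a) for a in associates(*z)) == 1 for z in odd_numbers
)
# The product of two primary numbers is again primary.
closed_under_product = all(
    is_primary(*multiply(z, w)) for z in primary_numbers for w in primary_numbers
)
```

The normalization plays the same role as choosing the positive generator of an ideal of $\mathbb{Z}$.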

To the character (16.16) we attach the L-function

(16.17)   $L(s,\psi) = \sum_{\mathfrak{a}}\psi(\mathfrak{a})(N\mathfrak{a})^{-s}$.

Since $\psi$ is completely multiplicative L has the Euler product

$L(s,\psi) = \prod_{\mathfrak{p}}\big(1 - \psi(\mathfrak{p})(N\mathfrak{p})^{-s}\big)^{-1}$.

This has a meromorphic continuation to $\mathbb{C}$, is entire apart from a simple pole at s = 1 in case of trivial $\psi$, which happens only if m = 0 and $\chi$ is trivial. There is also a functional equation for $\chi$ primitive proved by Hecke:

( s + 7 M) L (s,^P) = W (^) 1 A r(l -8 + \\m\) 1/(1 - 8, ip) 

where W is a Gauss sum normalized so that $|W| = 1$. By the functional equation one derives, by applying the Phragmén-Lindelöf principle, crude but sufficient upper bounds for $L(s,\psi)$ and $L'(s,\psi)$. Namely we have

(16.18)   $L(s,\psi) \ll (d(|s| + |m|))^{1-\sigma}(\log(4d(|s| + |m|)))^2$,

(16.19)   $L'(s,\psi) \ll (d(|s| + |m|))^{1-\sigma}(\log(4d(|s| + |m|)))^3$,

in the strip $\frac12 \le \sigma \le 1$ where the implied constant is absolute.

We also need a zero-free region for $L(s,\psi)$. It follows from the above bounds by classical arguments that there are no zeros for $s = \sigma + it$ with

(16.20)   $\sigma > 1 - c/\log(4d + |m| + |t|)$

where c is a positive absolute constant, apart from a possible exceptional simple real zero in the case that $\psi$ is real.

If $\psi$ is real then m = 0 so $\psi$ is a real character on residue classes of $\mathbb{Z}[i]$ to modulus 4d. One such character is given by

$\psi(\mathfrak{a}) = \chi_q(N\mathfrak{a})$

where $\chi_q$ is a real Dirichlet character to modulus q = 4d. In this case $L(s,\psi)$ is the product of two Dirichlet L-functions

$L(s,\psi) = L(s,\chi_q)L(s,\chi_4\chi_q)$.

This product is the automorphic L-function associated to a certain Eisenstein series for the group $\Gamma_0(q)$. There are a few other real characters on the group $\mathbb{Z}[i]/q\mathbb{Z}[i]$, which will be studied in Section 19. These give L-functions of cusp forms of weight one, level q and the central character $\chi_4\chi_q$. Applying Siegel's method one shows

Lemma 16.1. For any $0 < \varepsilon \le \frac14$ there exists a positive constant $c(\varepsilon)$ such that for every real character $\psi \pmod{4d}$ on $\mathbb{Z}[i]$ the Hecke L-function $L(s,\psi)$ does not vanish in the segment

(16.21)   $s > 1 - c(\varepsilon)d^{-\varepsilon}$.

Proof. We follow the simplified version of D. Goldfeld [Go] which applies 
in an extremely general context for L-functions attached to real characters. 
The constant c(e) is not computable from this proof nor from any other proof 
known to date. 

Let $\psi_1 \pmod{4d_1}$, $\psi_2 \pmod{4d_2}$ be real characters on $\mathbb{Z}[i]$ such that $\psi_1$, $\psi_2$, and $\psi_1\psi_2$ are nontrivial. We consider the product of L-functions

$L(s) = \zeta_K(s)L(s,\psi_1)L(s,\psi_2)L(s,\psi_1\psi_2)$

where $\zeta_K(s) = \zeta(s)L(s,\chi_4)$ is the zeta-function of $\mathbb{Z}[i]$. By the Euler product we see that the Dirichlet series for L(s) has nonnegative coefficients, specifically

$$L(s) = \exp\sum_{\mathfrak{p}}\sum_{k}\frac1k\big(1 + \psi_1(\mathfrak{p}^k)\big)\big(1 + \psi_2(\mathfrak{p}^k)\big)(N\mathfrak{p})^{-ks} = \sum_{n=1}^{\infty}a_nn^{-s}$$

where $a_1 = 1$ and $a_n \ge 0$ for all n. At s = 1 there is a simple pole with

$\mathrm{res}_{s=1}L(s) = \frac{\pi}{4}L(1,\psi_1)L(1,\psi_2)L(1,\psi_1\psi_2)$.

On the critical line we have by (16.18)

$L(s) \ll (d_1d_2|s|)^2$.
Hence we derive by contour integration that

$$\sum_{n=1}^{\infty}a_nn^{-\beta}e^{-n/y} = \frac{1}{2\pi i}\int_{(2)}L(s+\beta)\Gamma(s)y^s\,ds$$
$$= \frac{\pi}{4}L(1,\psi_1)L(1,\psi_2)L(1,\psi_1\psi_2)\Gamma(1-\beta)y^{1-\beta} + L(\beta) + O\big((d_1d_2)^2y^{\frac12-\beta}\big)$$

where $\frac34 \le \beta < 1$ and the error term bounds the integral on the line $\sigma = \frac12 - \beta$.

On the left side the series is bounded below by its first term $e^{-1/y} > 1 - y^{-1}$. Therefore we have

$$\frac{\pi}{4}L(1,\psi_1)L(1,\psi_2)L(1,\psi_1\psi_2)\Gamma(1-\beta)y^{1-\beta} + L(\beta) > 1 + O\big((d_1d_2)^2y^{\frac12-\beta}\big).$$

Take $\beta = \beta_1$ any real zero of $L(s,\psi_1)$ in $\frac34 \le \beta_1 < 1$ if it exists. Then $L(\beta_1) = 0$ and, choosing $y = c(d_1d_2)^8$ where c is a large absolute constant, we get

$L(1,\psi_1)L(1,\psi_2)L(1,\psi_1\psi_2) \gg (1 - \beta_1)(d_1d_2)^{-8(1-\beta_1)}$.

Since $L(1,\psi_1) \ll (\log 4d_1)^2$ and $L(1,\psi_1\psi_2) \ll (\log 4d_1d_2)^2$ by (16.18) we conclude that

(16.22)   $L(1,\psi_2) \gg (1 - \beta_1)(d_1d_2)^{-8(1-\beta_1)}(\log 4d_1d_2)^{-4}$


where the implied constant is absolute and can be effectively computed.

Now we are ready to complete the proof of the lemma. We argue as follows. Fix $0 < \varepsilon \le \frac14$. If, for every real character $\psi$, the function $L(s,\psi)$ does not vanish in the segment $s > 1 - \varepsilon$ then we take $c(\varepsilon) = \varepsilon$ getting the result. Suppose then there exists a real character $\psi$ for which $L(s,\psi)$ vanishes at some point $s > 1 - \varepsilon$. Fix one such, say $\psi = \psi_1$ and let $\beta_1$ denote the largest such zero. Let $\psi_2$ be any nontrivial real character on $\mathbb{Z}[i]$. If $\psi_1\psi_2$ is trivial then $L(s,\psi_2)$ has the same zeros as $L(s,\psi_1)$ in $s > 0$. In this case we take $c(\varepsilon) = 1 - \beta_1$ getting the result. If $\psi_1\psi_2$ is nontrivial then (16.22) yields the lower bound

$L(1,\psi_2) \gg d_2^{-8\varepsilon}(\log 4d_2)^{-4}$

where the implied constant is effectively computable in terms of $d_1$. On the other hand, if $\beta_2$ is a real zero of $L(s,\psi_2)$ in $s > 1 - \varepsilon$ then by the mean-value theorem and (16.19) we get the upper bound

$L(1,\psi_2) = (1 - \beta_2)L'(s,\psi_2) \ll (1 - \beta_2)d_2^{\varepsilon}(\log 4d_2)^3$.

Combining the upper and lower bounds we infer that $1 - \beta_2 \gg d_2^{-10\varepsilon}$ completing the proof of Lemma 16.1 (just change $10\varepsilon$ into $\varepsilon$). □


The information accumulated in (16.18)-(16.21) suffices by classical arguments to establish the upper bound

(16.23) L(s, ip)~ l < ci(log(|m| + \t\ + 3)) 2 
in the region

(16.24)   $\sigma \ge 1 - c(\varepsilon)/d^{\varepsilon}\log(|m| + |t| + 3)$

for any $\varepsilon > 0$ where $c(\varepsilon) > 0$ and the implied constant depends only on $\varepsilon$. To obtain such a bound for 1/L we may begin with the corresponding bound for L'/L, integrate it to the right to obtain a bound for $|\log L|$ and then exponentiate the latter.

Having the bound (16.23) we can now estimate each character sum

(16.25)   $S_\chi^k(\beta) = \sum_z \beta_z\psi(z)$

by employing L-functions with Grossencharacters. Recall that $\beta_z$ is given by (5.13) subject to (5.7) and (5.8) (the restriction for the number of divisors (5.9) was already waived after (10.4) had been established). Here the congruence (5.7) can be detected by characters to modulus eight so we can relax it by changing $\chi$ in $\psi$; however we retain the property of z being primary so as to keep the unique correspondence with ideals. The remaining restriction (5.8) that z be free of small prime factors will be taken care of easily after (16.25) has been expressed in terms of L-series.

To this end we expand the "angle mollifier" $q(\alpha)$ into its Fourier series

$q(\alpha) = \sum_{\ell}\hat q(\ell)e^{i\ell\alpha}$.

By the properties of q(a) recorded just before (5.13) it follows that the Fourier 
coefficients are bounded by 

(16.26) g(^)<0( i + e 2 e 2 )- 1 . 

Thus we can truncate the Fourier series with a small error term 

(16.27) q(a) = q(t)e ila + O)#” 1 !,- 1 ) . 

I e\<L 

This error term contributes to $S_\chi^k(\beta)$ at most $O(L^{-1}N)$ by a trivial estimation. Next we write the "radius mollifier" p(n) as the Mellin transform

p(n) = - / p(s)n~ s ds 

J(a) 

where ^ < cr < 2. Since p{n) is supported on the segment (4.12) and its 
derivatives satisfy (4.14) it follows that p{s) is entire and bounded by 

(16.28) p(s) < 6(1 + Oh 2 )- 1 N a . 

Applying the above formulas to the character sum (16.25) we get 

(16.29) S k x ((l) = ]T q(£)^~ [ p(s)Z k x +t (s)ds + 0(L~ l N) 
where 

z x(s) = Y X{z){z/\z\) m n{n)^{n)n - s , 

(*.n)=i 

n = \z\ 2 and (see (2.12)) 

l(n) = Y ^ • 

c\n, c^C 

We write the series for Z™{s) in terms of ideals (and, for notational simplicity, 
delete the scripts x, m ) getting 

Z(s) = 'i/j(a)p(Na)'y(Na)(Na)~ s . 

(o,n)=i 

In the analytic theory of zeta-functions it is more convenient to work with Dirichlet series over rational integers. Therefore we group the ideals in accordance with the norm and associate with the Grossencharacter the Hecke eigenvalue

(16.30) A in) = Y V>(a) = Y X{z){z/\z\) m . 

Na=n zz=n 

From the Euler product it follows that 

Y A ^ )T " = (! - X (P) T + <p) t 2 Y 1 

V 

where e(jp) = 1 if p = 1 (mod 4) and e(p) = 0, otherwise. Now 

Z(s) = Y, \{n)p{n)x{n)n~ s 
(n, n)=i 

= Y* Mc)c~ s Y x ( n )pYY~ s 

c^C (n,cn)=l 

( C ,n)=i 

= n i 1 - mp)p ~ s ) Y b a ( c ) n - mp)p~ s t' c ~ s 

p>P c^C p\c 

(c,n)=i 

= M(s)P(s)Q(s ) , 

say, where 

M(s) = (l - A (p)p~ s ) , P(s) = (l - A {p)p~ s y 1 > 

P p^P 

and Q(s ) is the remaining sum over c < C. Since M(s)L(s,ip) is given by an 
Euler product which converges absolutely in Re s > \ we get by (16.23) 

M(s) <C d(log(|m| + \t\ + 3)) 2 


(16.31) 
in the range (16.24). For the estimation of P(s) we assume in addition to (16.24) that

(16.32)   $1 - 1/\log P \le \sigma \le 1$.

In this region $|p^{-s}| = p^{-\sigma} < 3p^{-1}$ for all $p \le P$ so we have the trivial bound

(16.33)   $P(s) \ll (\log P)^3$.

We also estimate Q(s) trivially by 

(16.34) Q(s) <C C l ~ a \ogC . 

Multiplying the bounds (16.31), (16.33) and (16.34) we get 

(16.35) Z(s) < dC ll_f7 (log(|m| + \t\ + IV)) 6 

uniformly in the intersection of the regions (16.24) and (16.32). 

Now we can estimate the integral

(16.36)   $I = \dfrac{1}{2\pi i}\displaystyle\int_{(\sigma)}\hat p(s)Z(s)\,ds$.

We choose $T \ge |m| + 3$ and put

(16.37)   $\delta = \min\{c(\varepsilon)/d^{\varepsilon}\log T,\ 1/\log P\}$.

We then move the contour of integration in (16.36) to the vertical segments

$s = 1 + it$ with $|t| \ge T$,

$s = 1 - \delta + it$ with $|t| \le T$,

and to the two connecting horizontal segments

$s = \sigma \pm iT$ with $1 - \delta \le \sigma \le 1$.

Integrating trivially along the above segments we infer by (16.28) and (16.35) that

$I \ll d\,\big(T^{-1} + (C/N)^{\delta}\big)N(\log(N + T))^6$

$\ll d\,\big(T^{-1} + N^{-\eta\delta}\big)N(\log(N + T))^6$

since $C \le N^{1-\eta}$ where $\eta$ is a positive constant from Proposition 4.1. Choosing $T = 3\exp(\sqrt{\log N})$ we get

(16.38) I <C |exp [~’qc(e)d~ £ y^og N^j + exp | dN(log N) 6 

uniformly for $|m| \le 2\exp(\sqrt{\log N})$. By (16.38) and (16.29) we conclude, after summing over $\ell$ with $|\ell| \le L = \exp(\sqrt{\log N})$ and using (16.26), that the same bound (16.38) holds true for the character sum (16.25). We state this as





Lemma 16.2. Let $\chi$ be a character to modulus 4d on $\mathbb{Z}[i]$. Then

S^(/3) < | exp (-r)c(£)d~ e y/log n) + exp | dN (log N) 6 

uniformly for $|k| \le \exp(\sqrt{\log N})$, the implied constant depending only on $\varepsilon$.

Finally, by Lemma 16.2 and (16.15) we derive that $U(\beta)$ is bounded by

|exp 2r]c(e)X~ £ ^/log N^j + exp ^—2 | HX 4 N 2 (log HN) 13 

+ (P” 1 + H~ l + K~ l H 2 ) N 2 log N 

provided $K \le \exp(\sqrt{\log N})$. We choose H = X, $K = X^3$ and restrict X by

(16.39)   $\log X \ll \log\log N$.

Recall that P was already restricted by

(16.40)   $\log P \ll (\log N)(\log\log N)^{-2}$.

By these choices and restrictions the above bound for $U(\beta)$ becomes:

Proposition 16.3. For $1 \le X \le (\log N)^{\nu}$ with any $\nu$ we have

(16.41)   $U(\beta) \ll (P^{-1} + X^{-1})N^2\log N$,

where the implied constant depends on $\nu$.

Remarks. The restriction (16.40) can be weakened slightly, replaced by $\log P \ll (\log N)(\log\log N)^{-1}$ by an elementary but more sophisticated sieve method. However (16.39) cannot be weakened, given the current state of knowledge concerning exceptional zeros. For this reason the constant implied in (16.41) is not effectively computable nor is the one in our main Theorem 1. All the other arguments in this paper give effective results, such as Theorem 2.

The formula (16.15) is valid in any range of the moduli. Therefore one could attempt to use this also in the middle range to give a direct treatment of $V(\beta)$ by an appeal to the theory of the large sieve rather than by the synthesis of the three methods from Sections 11, 12, 13. However such an approach fails because it ignores the intrinsic nature of the factor $J(\chi)$ (which is a partial character sum over the subgroup of rational residue classes given by (16.13)). The change of the argument of $J(\chi)$ plays an important role. Note also that $|J(\chi)|$ can be as large as $\varphi(d)$ if $\chi$ is a real character (one of those constructed in Section 19), while the average value is $\sqrt{\varphi(d)}$; precisely we have

(16.42) Pjy XI I J(x)| 2 = m . 

X (mod d) 
17. Transformations of $W(\beta)$


It remains to estimate the partial sum $W(\beta)$ of $T(\beta)$ over $d > |\Delta|/4X$. In addition to this lower bound, 4d divides the determinant $\Delta$ and, since d is extremely large it is useful to switch d into the complementary divisor of $|\Delta|/4$. We get (see (10.13) and (10.17))


==>£/•(«> ee 


(zi,Z2) = l 
A(zi,Z2)=0(4d) 


^1^2 

A 


because ( = (jSfl ' • Here we write 


4 d_ 

TAJ^ V 4d 


|A| 


E 

4bd\A 


p(b) 


so that W (/3) becomes 


07.1) 2^'A^/-(4d) ££ (Hrrl) i°g 2 

b d (zi,Z2)=l 

A(2i,22)=0(46d) 


V|A|d 


Zl^2 

A 


Note that from our restrictions for the support of $\beta_z$ (see Section 5) and the above summation condition we have

(17.2)   $(z_1, \bar z_1) = (z_2, \bar z_2) = (z_1, z_2) = 1$,

(17.3)   $z_1 \equiv z_2 \pmod 8$,

(17.4)   $0 < r_1r_2 \equiv 1 \pmod 8$.

The positivity of $r_1r_2$ means that the points $z_1, z_2$ lie in the same half-plane, but in fact they even both belong to the same fixed narrow sector (5.12).


Lemma 17.1. Assuming (17.2)-(17.4) we have

(17.5)   $\Big(\dfrac{z_2/z_1}{|\Delta|}\Big) = \Big(\dfrac{s_1}{|r_1|}\Big)\Big(\dfrac{s_2}{|r_2|}\Big)$

where $\Delta = \Delta(z_1, z_2) = \mathrm{Im}\,\bar z_1z_2 = r_1s_2 - r_2s_1$ for $z_1 = r_1 + is_1$ and $z_2 = r_2 + is_2$. Note that $s_1, s_2$ are even and $(r_1, s_1) = (r_2, s_2) = 1$.

Proof. We shall appeal to the quadratic reciprocity law

(17.6)   $\Big(\dfrac{a}{|b|}\Big)\Big(\dfrac{b}{|a|}\Big) = (-1)^{\frac{a-1}{2}\frac{b-1}{2}}(a,b)_\infty$

for a, b odd and co-prime, where $(\frac{\cdot}{\cdot})$ is the Jacobi symbol and $(a,b)_\infty$ is the Hilbert symbol at $\infty$ defined by

(17.7)   $(a,b)_\infty = -1$ if $a < 0$, $b < 0$, and $(a,b)_\infty = 1$ otherwise.

We have $z_2/z_1 = (r_1r_2 + s_1s_2 + i\Delta)|z_1|^{-2}$, whence

$$\Big(\frac{z_2/z_1}{|\Delta|}\Big) = \Big(\frac{|z_1|^2}{|\Delta|}\Big)\Big(\frac{r_1r_2 + s_1s_2}{|r_1s_2 - r_2s_1|}\Big).$$


Put r\ = pui, r 2 = pu 2 with (■ u\,u 2 ) = 1, so {u\u 2 , sjs 2 ) = 1, u\u 2 = 1 (mod 8) 
and u\u 2 > 0. Putting |zi| 2 = r 2 + s 2 = p 2 ii\ + s 2 , we have 

(p 2 , uiu 2 + sis 2 )w 2 = uiu 2 \zi \ 2 + rtisi(ttis 2 - u 2 si), 


and so 


/ nr 2 + sis 2 \ _ ( sis 2 \ ( p 2 uiu 2 + sis 2 
\|ris 2 - r 2 si\) V P ) V \ u i s 2 ~ U 2 Si\ 

Since |zi| 2 is a square modulo p we get 


/ z 2 /zi 

V |A| 


SjS2 

P 


U\U 2 

UlS 2 - U 2 Sl 



-ft?) 


UlU 2 Zl 


[UlS 2 - U 2 S 1 


( SlS 2 \ / UlS 2 - U 2 S \ \ 

V p ) V U1U2 J 


by (17.6) (actually one has to pull out from the determinant the whole power 
of 2 before application of (17.6) and install it back after at no cost because of 
the convention (8.15) and the congruence u\u 2 = 1 (mod8)). Hence 


( 

V |A| J 

where 



by (17.6) using 0 < u\u 2 = 1 (mod 8). This completes the proof of (17.5). □ 
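The reciprocity law with the Hilbert symbol at infinity is easy to misstate when a or b is negative, so a numerical check is reassuring. The sketch below assumes the reading $(\frac{a}{|b|})(\frac{b}{|a|}) = (-1)^{\frac{a-1}{2}\frac{b-1}{2}}(a,b)_\infty$ of (17.6); the function names are ours and `jacobi` is the standard binary algorithm, not anything taken from the paper.

```python
from math import gcd

def jacobi(a, n):
    """Jacobi symbol (a/n) for odd n > 0 and any integer a."""
    a %= n
    result = 1
    while a:
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):      # (2/n) = -1 exactly when n = 3, 5 (mod 8)
                result = -result
        a, n = n, a                  # reciprocity step for odd positive a, n
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0

def hilbert_at_infinity(a, b):
    """(a, b)_infinity = -1 if both a and b are negative, and 1 otherwise."""
    return -1 if a < 0 and b < 0 else 1

def reciprocity_holds(a, b):
    """Compare both sides of the assumed form of (17.6) for odd coprime a, b."""
    lhs = jacobi(a, abs(b)) * jacobi(b, abs(a))
    parity = ((a - 1) // 2) * ((b - 1) // 2) % 2
    rhs = (-1 if parity else 1) * hilbert_at_infinity(a, b)
    return lhs == rhs

odd_integers = [n for n in range(-41, 42) if n % 2]
pairs = [(a, b) for a in odd_integers for b in odd_integers if gcd(a, b) == 1]
law_verified = all(reciprocity_holds(a, b) for a, b in pairs)
```

The sign factor is computed from the parity of $\frac{a-1}{2}\cdot\frac{b-1}{2}$ rather than by exponentiation, which avoids negative exponents when a or b is negative.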


Inserting (17.5) in (17.1) we arrive at 


w(i3) = 2'£ ! ^ 1 '£rm vr 

b d (zi,^ 2 )=l 

A(zi,Z2)=0(46d) 



^1^2 | 

A 1 


where

(17.8)   $\beta'_z = i^{\frac{r-1}{2}}\Big(\dfrac{s}{|r|}\Big)\beta_z$ if $z = r + is$.

Here we could introduce the factor $i^{\frac{r-1}{2}}$ because $r_1 \equiv r_2 \pmod 8$ by virtue of (17.3). This factor will turn out to be quite natural. Since the condition $\Delta(z_1, z_2) \equiv 0 \pmod{4bd}$ is equivalent to $z_1 \equiv \omega z_2 \pmod{4bd}$ for some rational $\omega \pmod{4bd}$ we have
(17.9) WW) = 2E^E/*(«) E Q 

b d cj(mod 46d) 

EE 

(zi,Z2) = l 

Zl = 0 JZ 2 (mod 46d) 

We reduce the summation to $b \le X$ and estimate the remaining sum trivially by $O(X^{-1}N^2)$. Then we remove the condition $(z_1, z_2) = 1$ at the cost of $O(P^{-1}N^2)$. Next we mollify the logarithmic factor $\log 2|z_1z_2/\Delta|$ as in (16.7) estimating the contribution from terms near the diagonal by $O(H^{-1}N^2)$ and we insert the truncated Fourier series (16.9) at the cost of $O(K^{-1}H^2N^2)$. We detect the congruence $z_1 \equiv \omega z_2 \pmod{4bd}$ by characters of $(\mathbb{Z}[i]/4bd\mathbb{Z}[i])^*$ getting

(mm) W (D)=2Y,Hk)Y,^Y,Q^ E jmiaW 

b^X d x (mod4 bd) 

+ O ((A'- 1 + P- 1 + FT 1 + K~ l H 2 )N 2 log N) 

where $\Phi(bd)$ is the order of the group $(\mathbb{Z}[i]/4bd\mathbb{Z}[i])^*$ (see the formula (16.12)), $J(\chi)$ is the character sum over rational classes

(17-11) J(x) = E xM Q . 

uj (mod46d) 

and 

(17.12) Sj(/3')=E't^)6/W)‘- 

Z 

The formula (17.10) closely resembles (16.15), the only important difference being between the character sums (17.12) and (16.14) wherein $\beta$ is turned into $\beta'$. This extra spin of $\beta'$ will be vital in producing cancellation in the sums (17.12). Recall that $\beta_z$ is given by (5.13) and it carries the support conditions $z \equiv 1 \pmod{2(1+i)}$ and $(z, \Pi) = 1$ (remember that (5.7) was detected by characters and (5.9) was waived in Section 10). We no longer need the angle mollifier $q(\alpha)$ in $\beta_z$. Thus we replace it by the truncated Fourier series (16.27) which changes $S_\chi^k(\beta')$ to

$\sum_{\ell}\hat q(\ell)S_\chi^{k+\ell}(\beta') + O(L^{-1}N)$.
Here the error term $O(L^{-1}N)$ gives by (16.42) a contribution to $W(\beta)$ which is at most $O(KX^2L^{-2}N^2\log X)$. We choose H = X and $K = L = X^3$ getting

(17.13) W{(5) « (log TV) E E E 

|fcR2AT 3 dsiX 2 x(mod4d) 

+ (X" 1 +P” 1 )N 2 log XN . 

Now we have

(17.14)   $S_\chi^k(\beta') = \sum'_{(z,\Pi)=1}\beta_z[z]\chi(z)(z/|z|)^k$

where $\sum'$ restricts to primary numbers, the coefficients $\beta_z$ being given by

(17.15)   $\beta_z = p(n)\mu(n)\sum_{c\mid n,\ c\le C}\mu(c)$

with $n = z\bar z$ and the symbol [z] being defined by

(17.16)   $[z] = i^{\frac{r-1}{2}}\Big(\dfrac{s}{|r|}\Big)$ if $z = r + is$.

We shall estimate every sum $S_\chi^k(\beta')$ separately for each Grossencharacter

(17.17)   $\psi(z) = \chi(z)(z/|z|)^k$.

This time the source of cancellation is the symbol [z] which is attached to $\beta_z$, not the character $\psi(z)$ nor the Möbius function $\mu(|z|^2)$. Therefore the conductor of $\psi$ plays a minor role. Our goal will be:

Proposition 17.2. For every character (17.17) we have

(17.18)   $S_\chi^k(\beta') \ll N^{1-\delta}$

uniformly in $d(|k| + 1) \le N^{\delta}$ for some positive constant $\delta$.

By Proposition 17.2 and (17.10) we at once derive 
Proposition 17.3. For 1 < X < N £ we have 

(17.19) W(P) < (P- 1 + X- 1 ) N 2 logN . 

18. Proof of main theorem 

First, assuming Proposition 17.2, we prove Proposition 10.2. Recall by (10.14) the decomposition $T(\beta) = U(\beta) + V(\beta) + W(\beta)$ in accordance with the size of the divisors of the determinant. These three parts were estimated respectively in Sections 16, 15, 17. Inserting the results (15.8), (16.41), (17.19) we obtain

$T(\beta) \ll (P^{-1} + X^{-\frac13})N^2\log N$

uniformly in $X \le (\log N)^{\nu}$ for any $\nu$ with an implied constant depending on $\nu$. Choosing $X = (\log N)^{3\sigma+3}$ we complete the proof of Proposition 10.2.

Assuming only Proposition 17.2 we have all the pieces finally in place for the proof of Theorem 1. Recall that this had already, in the early sections of the paper, been reduced to the proof of the bound (4.23) for the bilinear form $B^*(M, N)$ and to this end we need only the results up to Proposition 10.2. We re-trace the path we followed from (4.23) to Proposition 10.2 and the conditions imposed on the parameters A, A', B, r, and P. We required that

(4.4) (log log x) 2 < log-P < (log x) (log log x)~ 2 

and 

(4.11) r ^ (logx) A+124 . 

From B*(M,N) we passed to B(M,N ) which introduced the additional 
restrictions 

(5.11) P^§A + A' 

and P ^ (log x) A+A ' , the latter following already from (4.4). 

From B(M. N ) we passed by Cauchy’s inequality to V(M, N) and then to 
V*(M, N), the latter step introducing the new restrictions 

(5.23) r < xH 
and 

(5.24) P ^ r 2 (log x )2A+4A'+508_ 

Next, from T>*(M, N ) we passed to Vq(M, N) encountering the additional 
requirement 

(9.12) r<( logx)^ B ~i A ~ 2A '~^ b 

where b depends only on $\eta$ in Proposition 4.1. Note that this condition supersedes (5.23) and that, together with (4.4), it implies (5.24).

Finally T>q{M,N) is given by (10.10) where the error term requires the 
conditions 

(10.11) r > (logic) 24 " 1 " 220 , 
which supersedes (4.11), and 

(10.12) A'>2A + 2 20 . 
The main term in (10.10) introduces no new conditions due to Proposition 10.2 and (4.4).

It remains to show the consistency of this collection of conditions. To this 
end we may for example take A' = 2A + 2 20 and r = (log x) A+ ' 2 ' 2 °. If we take 
B sufficiently large we then ensure the remaining conditions (5.11) and (9.12). 
This completes the proof of Theorem 1. 

We still need to prove Proposition 17.2. We postpone this task to establish first a larger background so as to be able to include a proof of the more general Theorem $2^\psi$ stated in Section 26. This itself does not depend on the results already established. Some of the arguments of the proof have independent interest so we present these in considerable generality. The proof of Proposition 17.2 has essentially the same ingredients as that of Theorem $2^\psi$ plus a new combinatorial identity (see Proposition 24.2) which saves one third of the work by allowing us to make an appeal to Theorem $2^\psi$ itself.


19. Real characters in the Gaussian domain 


Let q be an odd integer and $\omega \pmod q$ a root of

(19.1)   $\omega^2 + 1 \equiv 0 \pmod q$.

Therefore every prime factor of q is congruent to 1 (mod 4), so it splits in $\mathbb{Z}[i]$. The number of roots $\omega \pmod q$ is equal to $2^t$, where t is the number of distinct prime factors of q. Given $\omega \pmod q$ we define a function on $\mathbb{Z}[i]$ by

(19.2)   $\xi(z) = \Big(\dfrac{r + \omega s}{q}\Big)$ if $z = r + is$.

Clearly $\xi$ is periodic of period q, and it is multiplicative as well since

$(r_1 + \omega s_1)(r_2 + \omega s_2) \equiv r_1r_2 - s_1s_2 + \omega(r_1s_2 + r_2s_1) \pmod q$.

Therefore $\xi$ is a character on Gaussian integers modulo q; it is an extension of the Jacobi symbol on the rational integers

(19.3)   $\xi(r) = \Big(\dfrac{r}{q}\Big)$ if $r \in \mathbb{Z}$.

When z is a unit we have

(19.4)   $\xi(\pm 1) = 1$

and

(19.5)   $\xi(\pm i) = 1$ if $q \equiv 1 \pmod 8$, and $\xi(\pm i) = -1$ if $q \equiv 5 \pmod 8$.

Thus $\xi(z)$ is an even character and, if $q \equiv 1 \pmod 8$ then $\xi(z)$ is a function on ideals.
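The statements (19.1)-(19.5) can be illustrated on a small modulus. The sketch below takes q = 65, lists the roots of $\omega^2 + 1 \equiv 0 \pmod q$, and checks that $\xi(z) = (\frac{r+\omega s}{q})$ is multiplicative on a box of Gaussian integers; it also evaluates $\xi(i)$ for a few q. The helper names and the choice of moduli are assumptions made for illustration.

```python
def jacobi(a, n):
    """Jacobi symbol (a/n) for odd n > 0 and any integer a."""
    a %= n
    result = 1
    while a:
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0

def roots_of_minus_one(q):
    """All residues omega mod q with omega^2 + 1 = 0 (mod q)."""
    return [w for w in range(q) if (w * w + 1) % q == 0]

def xi(omega, q, z):
    """The character (19.2): xi(r + is) = ((r + omega*s)/q)."""
    r, s = z
    return jacobi(r + omega * s, q)

def multiply(z, w):
    """Product of two Gaussian integers given as pairs (r, s)."""
    (a, b), (c, d) = z, w
    return (a * c - b * d, a * d + b * c)

q = 65  # 5 * 13, so t = 2 and we expect 2^2 = 4 roots
roots = roots_of_minus_one(q)
box = [(r, s) for r in range(-6, 7) for s in range(-6, 7)]
multiplicative = all(
    xi(omega, q, multiply(z, w)) == xi(omega, q, z) * xi(omega, q, w)
    for omega in roots for z in box for w in box
)
# xi(i) = (omega/q): expected +1 for q = 1 (mod 8) and -1 for q = 5 (mod 8).
value_at_i = {m: xi(roots_of_minus_one(m)[0], m, (0, 1)) for m in (5, 13, 17, 65)}
root_counts = {m: len(roots_of_minus_one(m)) for m in (5, 25, 65, 1105)}
```

Here 1105 = 5 · 13 · 17 has three distinct prime factors, while 25 has one, so the root counts separate "number of prime factors" from "number of distinct prime factors".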

The roots of (19.1) correspond to the representations

(19.6)   $q = u^2 + v^2$ with $(u, v) = 1$, v even.

These are given by

(19.7)   $\omega \equiv -\bar vu \pmod q$.

Since

(19.8)   $\Big(\dfrac{u}{q}\Big) = 1$

we also have

(19.9)   $\xi(z) = \Big(\dfrac{ur + vs}{q}\Big)$ if $z = r + is$.

If we require $w = u + iv$ to be primary and primitive then the correspondence (19.7) between the roots of (19.1) and the representations (19.6) is one-to-one and the characters (19.2) can be written as

(19.10)   $\xi_w(z) = \Big(\dfrac{\mathrm{Re}\,\bar wz}{q}\Big)$.
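The correspondence between roots and representations can be made concrete. The sketch below assumes the reading $\omega \equiv -\bar vu \pmod q$ of (19.7), with $\bar v$ the inverse of v modulo q, and the reading $\xi(z) = (\frac{ur+vs}{q})$ of (19.9). It enumerates the primary primitive $w = u + iv$ of norm q = 65 and compares with the roots of (19.1). The function names are hypothetical; `pow(x, -1, q)` needs Python 3.8 or later.

```python
from math import gcd

def jacobi(a, n):
    """Jacobi symbol (a/n) for odd n > 0 and any integer a."""
    a %= n
    result = 1
    while a:
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0

def primary_primitive_with_norm(q):
    """All w = u + iv with u^2 + v^2 = q, (u, v) = 1, v even, u = 1 + v (mod 4)."""
    bound = int(q ** 0.5) + 1
    return [
        (u, v)
        for u in range(-bound, bound + 1)
        for v in range(-bound, bound + 1)
        if u * u + v * v == q and gcd(u, v) == 1
        and v % 2 == 0 and (u - 1 - v) % 4 == 0
    ]

def root_attached_to(w, q):
    """omega = -u * v^{-1} (mod q), the assumed form of (19.7)."""
    u, v = w
    return (-u * pow(v % q, -1, q)) % q

q = 65
ws = primary_primitive_with_norm(q)
roots_from_w = sorted(root_attached_to(w, q) for w in ws)
all_roots = [x for x in range(q) if (x * x + 1) % q == 0]

box = [(r, s) for r in range(-6, 7) for s in range(-6, 7)]
# Compare the definition (19.2) with the assumed closed form (19.9).
formula_agrees = all(
    jacobi(r + root_attached_to(w, q) * s, q) == jacobi(w[0] * r + w[1] * s, q)
    for w in ws for (r, s) in box
)
real_part_symbol_is_one = all(jacobi(u, q) == 1 for (u, v) in ws)
```

For q = 65 the four primary primitive numbers are $1 \pm 8i$ and $-7 \pm 4i$, matching the four roots 8, 18, 47, 57.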




For $q = |w|^2$ prime these characters were considered by Dirichlet [Di]. We shall also denote (19.10) by

(19.11)   $\xi_w(z) = \Big(\dfrac{z}{w}\Big)$

and we call this the Dirichlet symbol even when w is not prime. No confusion should arise between the Jacobi symbol $\chi_q(r) = (\frac{r}{q})$ and the Dirichlet symbol $\xi_w(z) = (\frac{z}{w})$ because the latter is defined and will be used exclusively for w which is primary and primitive (a rational integer q is not primary primitive unless q = 1 in which case the two symbols coincide). Note that

$\Big(\dfrac{z}{w}\Big) = 0$ if and only if $(w, z) \ne 1$.

If both w and z are primary primitive then

(19.12)   $\Big(\dfrac{z}{w}\Big) = \Big(\dfrac{w}{z}\Big)$.

Indeed, if $\bar wz$ is primitive then by (19.8)

$\Big(\dfrac{z}{w}\Big)\Big(\dfrac{w}{z}\Big) = \Big(\dfrac{\mathrm{Re}\,\bar wz}{|wz|^2}\Big) = 1$

(because the Jacobi symbol is multiplicative in the lower entry) whereas if not then $(w, z) \ne 1$ in which case both sides of (19.12) vanish.
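The reciprocity law (19.12) can also be tested directly. The sketch below assumes that the Dirichlet symbol is computed as the Jacobi symbol $(\frac{\mathrm{Re}\,\bar wz}{|w|^2})$, which is our reading of (19.10), and compares $(\frac{z}{w})$ with $(\frac{w}{z})$ over all primary primitive pairs in a small box. The names `dirichlet_symbol` and `is_primary_primitive` are ours.

```python
from math import gcd

def jacobi(a, n):
    """Jacobi symbol (a/n) for odd n > 0 and any integer a."""
    a %= n
    result = 1
    while a:
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0

def dirichlet_symbol(z, w):
    """(z/w), read as the Jacobi symbol (Re(conj(w) * z) / |w|^2)."""
    (r, s), (u, v) = z, w
    return jacobi(u * r + v * s, u * u + v * v)

def is_primary_primitive(r, s):
    """s even, r = 1 + s (mod 4), and gcd(r, s) = 1."""
    return s % 2 == 0 and (r - 1 - s) % 4 == 0 and gcd(r, s) == 1

candidates = [
    (r, s) for r in range(-9, 10) for s in range(-8, 9)
    if is_primary_primitive(r, s)
]
symmetric = all(
    dirichlet_symbol(z, w) == dirichlet_symbol(w, z)
    for z in candidates for w in candidates
)
```

Both symbols share the upper entry $\mathrm{Re}\,\bar wz = ur + vs$, so the law amounts to saying that this integer has the same quadratic character modulo $|w|^2$ and modulo $|z|^2$.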
Applying the reciprocity formula (19.12) we deduce that for any $z, w_1, w_2$ primary primitive



Note that W\W 2 is not necessarily primitive since W\,W 2 may have a nontrivial 
common factor. Let e be the primary associate of (w\,W 2 j. Pulling out the 
divisor d = ee we obtain the primary primitive number w\ W 2 /d. Then applying 
the reciprocity law (17.6) and (19.12) we infer that 


(19.13) 


w 1 


W 2 


d 


W\W2/d 


This is true for all z, as seen by reducing to the case of z primary primitive. 
In particular, taking w\ = W 2 = w primary primitive so that d = (w,w) = 1, 
the formula (19.13) gives 


(19.14) £«;£«) = X q ° N 

where q = ww, Xq is the Jacobi symbol and N is the norm map. Now using 

(19.14) for q = d we can write (19.13) as 


(19.15) 


£uil£u>2 


Uei 


wiW2/ee 


where e is the primary associate of (w \, W 2 ) - This is the multiplicativity law 
in the lower entry for the Dirichlet symbol. 

Having $\xi(z) = \xi_w(z)$ as a function in $z \in \mathbb{Z}[i]$ one can define a character on odd ideals $\mathfrak{a} \subset \mathbb{Z}[i]$ by setting $\xi(\mathfrak{a}) = \xi(z)$ if $\mathfrak{a} = (z)$ with z primary. Note that if $w \equiv 1 \pmod 4$ then $q = w\bar w \equiv 1 \pmod 8$, and the choice of the primary generator is not necessary. With $\xi$ we associate the L-function


(19.16)   $L(s,\xi) = \sum_{\mathfrak{a}\ \mathrm{odd}}\xi(\mathfrak{a})(N\mathfrak{a})^{-s}$.


Since £ is completely multiplicative we have the Euler product 

£(*,«) = II (i-«p)(ivpr ') _1 

p odd 

= n i 1 - A (p)P~ S + Xq{.v)p~ 2s y l 

p> 2 


where 



£(vr) + £(tt) 
0 


if p = 7T7T, 7 r primary, 
if p = — 1 (mod 4) . 


If the character $\xi$ is nontrivial then the corresponding theta series 


$\theta(z, \xi) = \sum_{\mathfrak{a}\ \mathrm{odd}} \xi(\mathfrak{a})\, e(zN\mathfrak{a})$ 

is a cusp form of weight one, level 4q and character $\chi_4 \chi_q$. Hence $L(s, \xi)$ is an 
entire function. Moreover if $\xi$ is primitive we have the functional equation 

(19.17) $\left(\frac{\sqrt q}{\pi}\right)^{s} \Gamma(s) L(s, \xi) = W \left(\frac{\sqrt q}{\pi}\right)^{1-s} \Gamma(1 - s) L(1 - s, \xi)$ 

where W is the normalized Gauss sum. We shall not use any property of $L(s, \xi)$ 
in this paper although doing so would strengthen some of the results. 


20. Jacobi-Kubota symbol 


In Section 17 we arrived at the sums of $\beta_z$ twisted by Grossencharacters 
$\psi(z)$ and the symbol [z] over z primary and primitive. However, it makes sense 
to define 

(20.1) $[z] = i^{\frac{r-1}{2}} \left(\frac{s}{|r|}\right)$ 

for all $z = r + is \equiv 1 \pmod 2$. Note that [z] vanishes if z is not primitive. Here 
the factor $i^{\frac{r-1}{2}}$ is attached since it simplifies forthcoming relations. One could 
further refine [z] by including the Hilbert symbol 

$(r, s)_\infty = -1$ if $r, s < 0$, 
$(r, s)_\infty = 1$ otherwise, 

which amounts to extending the Jacobi symbol to all odd moduli (positive or 
negative) by setting 

(20.2) $\left(\frac{s}{r}\right) = \left(\frac{s}{|r|}\right) (r, s)_\infty$ . 

However, being unable to take advantage of such a refinement, in this paper 
we stay with (20.1). 

We shall be interested in the multiplicative structure of [z] where “multiplicative” 
refers to that within the Gaussian ring rather than rational multiplication 
relative to the coordinates r, s. For this reason we refer to [z] as the 
Jacobi-Kubota symbol rather than simply the Jacobi symbol. 

Of course, [z] is not multiplicative per se, yet it is nearly so, up to a factor 
which is the Dirichlet symbol. 




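Lemma 20.1. If w is primary primitive and $z \equiv 1 \pmod 2$ then 

(20.3) $[wz] = \varepsilon [w][z] \left(\frac{z}{w}\right)$ 

with $\varepsilon = \pm 1$ depending only on the quadrants in which w, z, wz are located. 
Precisely, if w = u + iv and z = r + is then (20.3) reads as 

(20.4) $i^{\frac{a-1}{2}} \left(\frac{us + vr}{|ur - vs|}\right) = \varepsilon\, i^{\frac{u-1}{2}} \left(\frac{v}{|u|}\right) i^{\frac{r-1}{2}} \left(\frac{s}{|r|}\right) \left(\frac{ur - vs}{u^2 + v^2}\right)$ 

where $a = \operatorname{Re} wz = ur - vs$ and $\varepsilon = \varepsilon(w, z)$ is given by 

(20.5) $\varepsilon = (u, v)_\infty (r, -v)_\infty$ if $ur > vs$ , 

(20.6) $\varepsilon = (u, v)_\infty (-r, v)_\infty$ if $ur < vs$ . 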


Remarks. The formula (20.3) is reminiscent of a similar rule for the theta- 
multiplier (cf. [Sh]). The presence of the Dirichlet symbol in (20.3) destroys 
the multiplicativity but this is just what enables successful estimation of the 
relevant bilinear forms (see the next three sections). 
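
The multiplier rule of Lemma 20.1 lends itself to a brute-force check on a small box. The sketch below is illustrative only and rests on stated assumptions: $[z] = i^{(r-1)/2}(s/|r|)$ for $z = r + is$ with $r$ odd, the Dirichlet symbol taken as $(\operatorname{Re} wz / |w|^2)$, $\varepsilon = (u,v)_\infty (r,-v)_\infty$ when $ur > vs$ and $\varepsilon = (u,v)_\infty(-r,v)_\infty$ when $ur < vs$, and the Hilbert symbol at infinity equal to $-1$ exactly when both entries are negative. All function names are hypothetical.

```python
from math import gcd

I_POW = (1, 1j, -1, -1j)  # exact powers of i

def jacobi(a, n):
    """Jacobi symbol (a/n) for odd n > 0; returns 0 when gcd(a, n) > 1."""
    assert n > 0 and n % 2 == 1
    a %= n
    result = 1
    while a:
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0

def hilbert(x, y):
    """Hilbert symbol at infinity: -1 if both entries are negative, else 1."""
    return -1 if x < 0 and y < 0 else 1

def kubota(r, s):
    """[z] = i^((r-1)/2) (s/|r|) for z = r + is with r odd (assumed form of (20.1))."""
    return I_POW[((r - 1) // 2) % 4] * jacobi(s, abs(r))

def lemma_20_1_holds(w, z):
    """Compare both sides of the multiplier rule for w = (u, v), z = (r, s)."""
    (u, v), (r, s) = w, z
    a, b = u * r - v * s, u * s + v * r          # wz = a + ib, a is odd
    eps = hilbert(u, v) * (hilbert(r, -v) if a > 0 else hilbert(-r, v))
    lhs = kubota(a, b)
    rhs = eps * kubota(u, v) * kubota(r, s) * jacobi(a, u * u + v * v)
    return lhs == rhs

def failures(bound):
    """All (w, z) in the box violating the rule; w primary primitive, z odd."""
    ws = [(u, v) for u in range(-bound, bound + 1) for v in range(-bound, bound + 1)
          if u % 2 == 1 and v % 2 == 0 and (u + v) % 4 == 1 and gcd(u, v) == 1]
    zs = [(r, s) for r in range(-bound, bound + 1) for s in range(-bound, bound + 1)
          if r % 2 == 1 and s % 2 == 0]
    return [(w, z) for w in ws for z in zs if not lemma_20_1_holds(w, z)]
```

For instance `lemma_20_1_holds((3, 2), (-1, -2))` exercises a case with $\varepsilon = -1$.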

Proof. We give a direct proof of (20.3) using the quadratic reciprocity law 
(17.6) together with its supplement 


(20.7) $\left(\frac{2}{d}\right) = (-1)^{\frac{d^2-1}{8}}$ if $2 \nmid d$. 


We also need the following easy properties of the Hilbert symbol 


(20.8) $(x, y) = (y, x)$ 

(20.9) $(x, yz) = (x, y)(x, z)$ 

(20.10) $(x, -y) = (x, y)\, \operatorname{sign} x$ 

(20.11) $(-x, -y) = -(x, y)\, \operatorname{sign} xy$ 


Put $wz = ur - vs + i(us + vr) = a + ib$, say. First, we consider the case 
a > 0. We can assume that (u, v) = (r, s) = 1, or else both sides of (20.4) 
vanish. Put $v = \rho v_1$ and $r = \rho r_1$ with $(v_1, r_1) = 1$. Then 




because (^—^= 1. Since 0 < u 2 + v 2 = 1 (mod4) we have 


u 2 + v 2 


u 2 + v 2 


by the reciprocity law, giving the last factor in (20.4). Therefore 

and we need to compute the factors on the right. First, by the reciprocity 
(17.6) we find that 


n 


ur i — v\s 


= ("!)■ 


r i —1 ur i —1 


Ini 


uri-v\s\ ( v\s 

= (signr)(—1) ^ 2 -—- 


N 


Similarly, writing $v_1 = 2^{\alpha} v'$ with v' odd, we compute by twice applying 
reciprocity that 


Vl 


ur i — v\ s 


. „ . v'-l ur t -1 / uri 

= (— 1 ) 2 2 


= (ur, v) c 


ur i — wis 
2 " 


|rt?T| / \ur i — vis. 

In the last symbol we change v' back to $v_1$ and we evaluate the resulting Jacobi 
symbol for $2^{\alpha}$ by means of (20.7) getting 

2 " 


( ra ) = <-'> T 


v ur i — vi s y 

because $(ur_1)^2 (ur_1 - v_1 s)^2 - 1 \equiv 2 v_1 s \equiv 2vs \pmod{16}$. Note also that we have 
$(\operatorname{sign} r)(ur, v)_\infty = (\operatorname{sign} r)(r, v)_\infty (u, v)_\infty = (r, -v)_\infty (u, v)_\infty = \varepsilon$. Gathering 
these results we obtain 

f) = ) GiV-A) (l 

a) V P J VM/ \\ri\J \w 


because the symbol j appears twice and therefore annihilates itself. Now, 
again by the reciprocity (17.6) we transform 


us \ ( V\ 
P ) VM 


s 

In I 


V 


PJ VM/ VM/ V|r 

, ( V \ / S 

= ( - i; ’ 2 M 


Hence 


= e(-iy 


where the parity of v is given by 

r\ — 1 u — 1 p — lu — 1 vs _ fpr — 1 p — 1 \ it — 1 vs 


v 


1 2 
r + 1 


+ 


2 2 + 4 


+ 


P- 


- 1 


2 2 
u — 1 vs _ r — lu — 1 vs 

2 ^ X = 2 2 T 


2 + T 


r — lu- 1 vs u — 1 r — 1 a — 1 
= 2 2 ^ ~4 ~ 4 1 4 4 

This completes the proof of (20.4) in the case a = ur - vs > 0. 
If a < 0 we apply (20.4) with (r, s) changed to (-r, -s). The left side 
changes by the factor 



while the right side changes by the factor (see (20.10), (20.11)) 

i~ r (yj') ( r ,-v) 00 (-r,-v) 00 =i(r,-v) 00 (-r,v) 00 . 

These changes lead to (20.4) with e switched from (20.5) to (20.6). This 
completes the proof of Lemma 20.1. □ 

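We shall face the problem of separating w, z in $\varepsilon(w, z)$. In this connection 
one can use the formula 

(20.12) $2(x, y)_\infty = 1 + \operatorname{sign} x + \operatorname{sign} y - \operatorname{sign} xy$ 
to write 

(20.13) $2(r, -v)_\infty = 1 + \operatorname{sign} vr + \operatorname{sign} r - \operatorname{sign} v$ 

(20.14) $2(-r, v)_\infty = 1 + \operatorname{sign} vr - \operatorname{sign} r + \operatorname{sign} v$ . 

Hence one can write both cases (20.5), (20.6) in a single form: 

(20.15) $2\varepsilon(w, z)(u, v)_\infty = 1 + \operatorname{sign}(vr) - (\operatorname{sign} v - \operatorname{sign} r) \operatorname{sign}(\operatorname{Re} wz)$ . 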
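
The sign identities used here are finite case checks, so they can be confirmed mechanically. The sketch below is an illustration under assumptions: the Hilbert symbol at infinity is $-1$ exactly when both entries are negative, all arguments are nonzero, and `eps_times_uv` encodes $\varepsilon(w,z)(u,v)_\infty$ as $(r,-v)_\infty$ for $\operatorname{Re} wz > 0$ and $(-r,v)_\infty$ for $\operatorname{Re} wz < 0$. Function names are hypothetical.

```python
from itertools import product

def sign(x):
    return (x > 0) - (x < 0)

def hilbert(x, y):
    """Hilbert symbol at infinity: -1 if both entries are negative, else 1."""
    return -1 if x < 0 and y < 0 else 1

def check_hilbert_identities(values):
    """Verify the sign formula for 2(x, y) and two of the symmetry rules."""
    for x, y in product(values, repeat=2):
        assert 2 * hilbert(x, y) == 1 + sign(x) + sign(y) - sign(x * y)
        assert hilbert(x, -y) == hilbert(x, y) * sign(x)
        assert hilbert(-x, -y) == -hilbert(x, y) * sign(x * y)
    return True

def eps_times_uv(r, v, a):
    """epsilon(w, z) * (u, v)_inf by cases, with a = Re(wz) nonzero."""
    return hilbert(r, -v) if a > 0 else hilbert(-r, v)

def single_form(r, v, a):
    """Half of 1 + sign(vr) - (sign v - sign r) sign(a), the combined formula."""
    return (1 + sign(v * r) - (sign(v) - sign(r)) * sign(a)) // 2
```

Comparing `eps_times_uv` with `single_form` over all sign patterns of nonzero `r`, `v`, `a` shows the two cases merging into one expression.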

Remarks. If w, z are primary primitive we know by (20.3) and (19.12) that 
$\varepsilon(w, z)$ is symmetric, however this property is not so apparent from (20.15). 

The formula (20.15) reduces the problem of the separation of variables 
in $\varepsilon(w, z)$ to that in sign(Re wz), and the latter requires an application of 
harmonic analysis. One can proceed in a number of ways. We shall use the 
characters $e^{ik \arg z}$ since these fit well the analysis previously employed. Note 
that sign(Re z) = 1 is equivalent to $|\arg z| < \frac{\pi}{2}$ whence 

(20.16) $\operatorname{sign}(\operatorname{Re} wz) = t(\arg wz)$ 

where $t(\alpha)$ is the periodic function of period $2\pi$ which is even, takes value 1 if 
$0 \le \alpha < \frac{\pi}{2}$ and -1 if $\frac{\pi}{2} < \alpha \le \pi$. Since wz is not purely imaginary because 
of $wz \equiv 1 \pmod 2$, we can smooth $t(\alpha)$ slightly at $\alpha = \frac{\pi}{2}$ without altering the 
values (20.16) but getting the Fourier expansion 

(20.17) $t(\alpha) = \sum_k \hat t(k) e^{ik\alpha}$ 

which converges absolutely and has $\ell_1$-norm of its coefficients almost bounded. 
Precisely, if w, z are restricted by $|wz| < R$ and $wz \equiv 1 \pmod 2$ then wz stays 
away from the imaginary axis by an angle $R^{-1}$ and this is sufficient room for 
the modification of $t(\alpha)$ so that 

(20.18) $\sum_k |\hat t(k)| \ll \log 2R$. 

By (20.16) and (20.17) we write 

(20.19) $\operatorname{sign}(\operatorname{Re} wz) = \sum_k \hat t(k) \left(\frac{w}{|w|}\right)^k \left(\frac{z}{|z|}\right)^k$ . 

Inserting (20.19) into (20.15) we obtain an expression for $\varepsilon(w, z)$ in which the 
variables w, z appear separately. 


21. Bilinear forms in Dirichlet symbols 


In this section we establish (among other things) estimates for general 
sums of type 

(21.1) $Q(M, N) = \sum^{*}_{w} \sum_{z} \alpha_w \beta_z \left(\frac{z}{w}\right)$ 

where $\alpha_w$, $\beta_z$ are complex coefficients supported in the discs 

(21.2) $|w|^2 \le M$ 

(21.3) $|z|^2 \le N$ 

and the * restricts the summation over w to the primary primitive numbers. 
To simplify the arguments we assume, as we may, that 

(21.4) $|\alpha_w| \le 1$ 

(21.5) $|\beta_z| \le 1$ . 

Our aim is to improve on the trivial bound 

(21.6) $Q(M, N) \ll MN$ . 


For the sake of simplicity we do not attempt to show the strongest results. The 
method is robust and it deserves a more precise consideration in a separate 
project. In particular one could employ the theory of the L-function (19.16) 
but we choose direct arguments which are sufficiently powerful. We begin with 
the following 


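Lemma 21.1. Let $w_1, w_2$ be primary primitive. Put $q = |w_1 w_2|^2$ and 
$d = |(w_1, \bar w_2)|^2$; so $d^2 \mid q$. Then we have 


(21.7) $\sum_{\zeta \pmod q} \left(\frac{\zeta}{w_1}\right)\left(\frac{\zeta}{w_2}\right) = q\varphi(d)\varphi(q/d)$ if q and d are squares, and $= 0$ otherwise. 

Proof. By (19.13) we have, for $\zeta = r + is$, 

$\left(\frac{\zeta}{w_1}\right)\left(\frac{\zeta}{w_2}\right) = \left(\frac{r^2 + s^2}{d}\right)\left(\frac{\zeta}{w_1 w_2/d}\right)$ . 

Hence the sum is equal to 

$\sum\sum_{r, s \pmod q} \left(\frac{r^2 + s^2}{d}\right)\left(\frac{r + \omega s}{q/d^2}\right) = \sum\sum_{r, s \pmod q} \left(\frac{r - \omega s}{d}\right)\left(\frac{r + \omega s}{q/d}\right)$ 

for some $\omega^2 + 1 \equiv 0 \pmod q$. We change the variables r, s into $r - \omega s = x$ and 
$r + \omega s = y$ getting 

$\sum_{x \pmod q} \left(\frac{x}{d}\right) \sum_{y \pmod q} \left(\frac{y}{q/d}\right) = q \sum_{x \pmod d} \left(\frac{x}{d}\right) \sum_{y \pmod{q/d}} \left(\frac{y}{q/d}\right) = 0$ 

unless d and q/d are squares in which case the sum equals $q\varphi(d)\varphi(q/d)$, 
completing the proof of (21.7). □ 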

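Lemma 21.2. We have 

(21.8) $Q(M, N) \ll \left( M^{\frac12} N + M^{2} N^{\frac34} + M^{3} N^{\frac12} \right) (MN)^{\varepsilon}$ . 

Proof. By Cauchy’s inequality 

$|Q(M, N)|^2 \le \|\beta\|^2 \sum_z \Big| \sum^{*}_{w} \alpha_w \left(\frac{z}{w}\right) \Big|^2 = \|\beta\|^2 \sum^{*}_{w_1} \sum^{*}_{w_2} \alpha_{w_1} \bar\alpha_{w_2} \sum_z \left(\frac{z}{w_1}\right)\left(\frac{z}{w_2}\right)$ . 

Here and below z runs over Gaussian integers in the disc (21.3). Splitting the 
inner summation into residue classes $\zeta \pmod q$ with $q = |w_1 w_2|^2$ we get by 
elementary counting that 

$\sum_z \left(\frac{z}{w_1}\right)\left(\frac{z}{w_2}\right) = \sum_{\zeta \pmod q} \left(\frac{\zeta}{w_1}\right)\left(\frac{\zeta}{w_2}\right) \left\{ \frac{\pi N}{q^2} + O\left( \frac{\sqrt N}{q} + 1 \right) \right\}$ . 

Hence by Lemma 21.1 we get 

$|Q(M, N)|^2 \ll N^2 \sum\sum_{m_1, m_2;\ m_1 m_2 = \square} \tau(m_1 m_2) + N M^4 (\sqrt N + M^2)$ 

which yields (21.8). □ 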


The estimate (21.8) is trivial if $N < M^4$. We shall refine this by exploiting 
the multiplicativity of $\xi_w(z)$ in z. Applying Hölder’s inequality we get 


$Q^k(M, N) \ll M^{k-1} \sum^{*}_{w} \Big| \sum_z \beta_z \left(\frac{z}{w}\right) \Big|^k = M^{k-1} \tilde Q(M, N^k)$ 

where $\tilde Q$ is the bilinear form of type (21.1) with coefficients 

$\tilde\beta_z = \sum_{z_1 \cdots z_k = z} \beta_{z_1} \cdots \beta_{z_k}$ 

and some $\tilde\alpha_w$ with $|\tilde\alpha_w| = |\alpha_w|$. Since $\tilde\beta_z \ll N^{\varepsilon}$ we get by applying (21.8) that 

$Q^k(M, N) \ll M^{k-1} \left( M^{\frac12} N^{k} + M^{2} N^{\frac{3k}{4}} + M^{3} N^{\frac{k}{2}} \right) (MN)^{\varepsilon}$ , 

whence 

$Q(M, N) \ll \left( M^{1 - \frac{1}{2k}} N + M^{1 + \frac{1}{k}} N^{\frac34} + M^{1 + \frac{2}{k}} N^{\frac12} \right) (MN)^{\varepsilon}$ . 

This holds for any integer $k \ge 1$ and any $\varepsilon > 0$ with the implied constant 
depending only on k and $\varepsilon$. Since k can be arbitrarily large we have already 
established a nontrivial bound for Q(M, N) whenever M and N have the same 
order of magnitude in the logarithmic scale. The above result would be sufficient 
for a proof of Theorem 1 but not of Theorem 2. However, it is possible 
to do better by exploiting a symmetry of Q(M, N). With this aim in mind we 
choose k = 6, getting 

Corollary. We have 

(21.9) $Q(M, N) \ll \left( M^{\frac{11}{12}} N + M^{\frac76} N^{\frac34} + M^{\frac43} N^{\frac12} \right) (MN)^{\varepsilon}$ 

where the implied constant depends only on $\varepsilon$. 

Still (21.9) is trivial if $N < M^{\frac23}$. Now we refine (21.9) by exploiting the 
reciprocity law (19.12). This requires both w and z to be primary primitive. 
Let $Q^*(M, N)$ denote the form (21.1) restricted to primary primitive w and 
z. Of course, (21.9) holds for $Q^*(M, N)$. Interchanging w with z we switch 
M with N in (21.9). Then taking the minimum of the two bounds (21.9) we 
deduce that 

$Q^*(M, N) \ll \left( M N^{\frac{11}{12}} + N M^{\frac{11}{12}} \right) (MN)^{\varepsilon}$ . 

This estimate extends to Q(M,N) through the following arrangement: 

q(m,n) = ££X“«A~-(ts) GO =£<m.JW 2 ) • 

say, where $Q^*(M, Nd^{-2})$ has coefficients depending on d and bounded by 1. 
Hence we conclude 

Proposition 21.3. For any complex coefficients $\alpha_w, \beta_z$ supported in 
the discs (21.2), (21.3) and satisfying (21.4), (21.5) respectively we have 

(21.10) $Q(M, N) \ll (M + N)^{\frac{1}{12}} (MN)^{\frac{11}{12} + \varepsilon}$ 
where the implied constant depends only on $\varepsilon$. 

Note that the bound (21.10) is nontrivial whenever M and N have the 
same order of magnitude in the logarithmic scale. 
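
The exponent bookkeeping behind the Hölder step can be reproduced with exact fractions. The sketch below is illustrative and assumes the three terms of (21.8) are $M^{1/2}N$, $M^{2}N^{3/4}$, $M^{3}N^{1/2}$ (our reading of this copy); it then applies $Q^k \ll M^{k-1} Q(M, N^k)$ and takes $k$-th roots. The function names are hypothetical.

```python
from fractions import Fraction as F

def holder_terms(k):
    """Exponent pairs (a, b), meaning M^a N^b, after the Hoelder step with parameter k.

    Start from M^(1/2) N, M^2 N^(3/4), M^3 N^(1/2) with N replaced by N^k,
    multiply by M^(k-1), then take the k-th root.
    """
    base = [(F(1, 2), F(1)), (F(2), F(3, 4)), (F(3), F(1, 2))]
    return [((k - 1 + a) / k, b) for a, b in base]

def log_size(term, theta):
    """Exponent of M in M^a N^b when N = M^theta."""
    a, b = term
    return a + b * theta

def symmetric_exponent(theta):
    """Exponent of M in (M + N)^(1/12) (MN)^(11/12) when N = M^theta."""
    return max(F(1), theta) / 12 + F(11, 12) * (1 + theta)
```

With `k = 6` the terms come out as $M^{11/12}N$, $M^{7/6}N^{3/4}$, $M^{4/3}N^{1/2}$, and for $N \ge M$ the first term dominates and matches the symmetric bound $(M+N)^{1/12}(MN)^{11/12}$.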

Our actual goal is to estimate bilinear forms in the Jacobi-Kubota symbol 
[wz] (see (20.1)) rather than in the Dirichlet symbol $\left(\frac{z}{w}\right)$ (see (19.10)). However, 
by virtue of the multiplier rule (20.3) together with the formula (20.15) 
and the Fourier series (20.19) these problems are essentially equivalent as long 
as the coefficients $\alpha_w, \beta_z$ are arbitrary but bounded. We denote 

(21.11) $\mathcal{K}(M, N) = \sum_w \sum_z \alpha_w \beta_z [wz]$ 

where $\alpha_w$, $\beta_z$ are complex coefficients supported on primary numbers in the 
discs (21.2), (21.3) respectively. We shall also need the restricted bilinear form 

(21.12) $\mathcal{K}^*(M, N) = \sum\sum_{(w, z) = 1} \alpha_w \beta_z [wz]$ . 

Proposition 21.4. For any complex coefficients $\alpha_w, \beta_z$ supported on 
primary numbers in the discs (21.2), (21.3) and satisfying (21.4), (21.5) respectively 
we have 

(21.13) $\mathcal{K}(M, N) \ll (M + N)^{\frac{1}{12}} (MN)^{\frac{11}{12} + \varepsilon}$ 

where the implied constant depends only on $\varepsilon$. The same result holds true for 
the restricted bilinear form $\mathcal{K}^*(M, N)$. 

Proof. By Proposition 21.3 (without loss of generality one can restrict 
w and z to primitive numbers because otherwise [wz] vanishes) we at once 
obtain (21.13). It remains to derive the same bound for $\mathcal{K}^*(M, N)$. Detecting 
the condition (w, z) = 1 by Möbius inversion we obtain 

$\mathcal{K}^*(M, N) = \sum_e \mu(e) \sum_w \sum_z \alpha_{ew} \beta_{ez} [e^2 wz]$ . 

Here we can assume that $e^2$ is primitive since otherwise $[e^2 wz]$ vanishes. Then 
by (20.3) we write 

$[e^2 wz] = \varepsilon [e^2][wz] \left(\frac{wz}{e^2}\right)$ 

where $\varepsilon = \pm 1$ depends only on the quadrants in which $e^2 wz$, $e^2$, wz are 
located. For each e we can separate the variables w, z in the $\varepsilon$-factor by 
applying (20.15) and (20.19) which will cost us log N by virtue of (20.18). 
In this way we obtain bilinear forms of type $\mathcal{K}(M|e|^{-2}, N|e|^{-2})$. These are 
$O\left( |e|^{-\frac{23}{6}} (M + N)^{\frac{1}{12}} (MN)^{\frac{11}{12} + \varepsilon} \right)$ due to (21.13). Summing this 
bound over e we conclude that (21.13) holds for $\mathcal{K}^*(M, N)$. □ 

22. Linear forms in Jacobi-Kubota symbols 

In addition to the general bilinear forms $\mathcal{K}(M, N)$ and $\mathcal{K}^*(M, N)$ we shall 
need bounds for the special linear forms 

(22.1) $\mathcal{K}(N) = \sum^{\wedge}_{z \in \mathfrak{B}} \psi(z)[wz]$ 

(22.2) $\mathcal{K}^*(N) = \sum^{\wedge}_{z \in \mathfrak{B},\ (z, w) = 1} \psi(z)[wz]$ . 

where $\wedge$ restricts to the primary numbers, $\mathfrak{B}$ is a polar box contained in the 
disc $|z|^2 \le N$, w is a fixed primary number and $\psi$ is a Hecke character given 
by (17.17), namely 

$\psi(z) = \chi(z)(z/|z|)^k$ . 

Here $\chi$ is a character on residue classes to modulus 4d and k is a rational 
integer. 

Note the trivial bound $\mathcal{K}(N) \ll N$. We seek a bound which is nontrivial 
uniformly in large ranges of the relevant parameters d, |k|, |w|. We succeeded 
in establishing 

Proposition 22.1. Given $\psi$ and w as above we have 

(22.3) $\mathcal{K}(N) \ll d(|k| + 1)|w| N^{\frac34} \log |w|N$, 

(22.4) $\mathcal{K}^*(N) \ll d(|k| + 1)|w| \tau(|w|^2) N^{\frac34} \log |w|N$ . 
where the implied constant is absolute. 

Proof. We use the Pólya-Vinogradov theorem which asserts that 

(22.5) $\sum_{n \le N} \chi(n) \ll q^{\frac12} \log q$ 

for any nontrivial Dirichlet character $\chi \pmod q$ where the implied constant is 
absolute. 
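
The Pólya-Vinogradov bound is easy to observe on examples. The sketch below is only an illustration: it takes the Legendre symbol modulo an odd prime $p$ as the nontrivial character, computes the largest partial sum over one period, and compares it with $\sqrt{p}\log p$. The helper names are hypothetical.

```python
from math import log, sqrt

def is_prime(n):
    return n >= 2 and all(n % p for p in range(2, int(sqrt(n)) + 1))

def legendre(n, p):
    """Legendre symbol (n/p) for an odd prime p, via Euler's criterion."""
    t = pow(n, (p - 1) // 2, p)
    return -1 if t == p - 1 else t

def max_partial_sum(p):
    """Largest |sum_{n <= N} (n/p)|; one period suffices since the full sum is 0."""
    total = best = 0
    for n in range(1, p):
        total += legendre(n, p)
        best = max(best, abs(total))
    return best

def polya_vinogradov_ratios(limit):
    """Map each odd prime p < limit to max partial sum divided by sqrt(p) log p."""
    return {p: max_partial_sum(p) / (sqrt(p) * log(p))
            for p in range(3, limit) if is_prime(p)}
```

In this range every ratio stays below 1, which is the shape of estimate that (22.5) supplies with an absolute constant.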

For the proof of (22.3) we can assume that w is primitive or else [wz] 
vanishes for all z, hence so does $\mathcal{K}(N)$. By Lemma 20.1 we write 

(22.6) $\mathcal{K}(N) = \varepsilon [w] \sum^{\wedge}_{z \in \mathfrak{B}} \psi(z)[z]\xi_w(z)$ . 


Here we have pulled out the factor $\varepsilon = \pm 1$ which is permissible since it can 
be made constant by splitting $\mathfrak{B}$ into sixteen sectors (if necessary) to keep the 
arguments of z and wz in fixed quadrants. Write the real character as (see 
(19.2)) 

(22.7) $\xi_w(z) = \left(\frac{r + \omega s}{q}\right)$ if $z = r + is$ 

where $q = |w|^2$ and $\omega^2 + 1 \equiv 0 \pmod q$. Inserting (17.17), (20.1) and (22.7) in 
(22.6) we get 




r + ujs 

q 


Observe that 


r+ms 

Q 


) = (¥ 


. For given r we translate s by lot getting 


|/C(1V)| < 

\r\<VW 

r=l(mod 2) 


s£l(r) 


LOT 


— ir) ( 


s + uir — ir 
s + ujr — ir 


k 



where I(r) is a segment having length $< 2\sqrt N$ of integers in the progression 
$s \equiv r - 1 - \omega r \pmod 4$. The inner sum satisfies 

$\sum_{s \in I(r)} \ll d(|k| + 1) \sqrt{q|r|} \log q|r|$ 

by the Pólya-Vinogradov estimate (22.5) provided that $q|r|/(d, q|r|)$ is not a 
square. Here the factor d is lost by splitting s into residue classes modulo 
4d which is necessary to fix the values of $\chi$ whereas the factor |k| + 1 is lost 
from an application of partial summation which is needed to remove the sector 
character. The condition that $q|r|/(d, q|r|)$ is not a square ensures that the 
relevant Jacobi symbol is not the trivial character. If it is a square we simply 
use the trivial bound $2\sqrt N$. Summing these bounds over r we obtain (22.3). □ 


Now we apply (22.3) to estimate the reduced linear form $\mathcal{K}^*(N)$. Detecting 
the condition (z, w) = 1 by Möbius inversion we get 

$\mathcal{K}^*(N) = \sum_{e \mid w} \mu(e)\psi(e) \sum^{\wedge}_{z \in \mathfrak{B}/e} \psi(z)[ewz]$ . 

Hence we obtain (22.4). 


Remarks. The bound (22.3) improves the trivial bound $\mathcal{K}(N) \ll N$ if 
(22.8) $d(|k| + 1)|w| < N^{\frac14} (\log N)^{-1}$ . 

One can improve the range of uniformity (22.8) considerably by applying the 
well known estimate of Burgess [Bu] in place of (22.5), but we do not need 
such a strong result here (we did use Burgess’ estimate in an earlier version of 
this work). 

23. Linear and bilinear forms in quadratic eigenvalues 

To the Hecke character (17.17) we associate the “quadratic eigenvalues” 

(23.1) $\lambda(n) = \sum^{\wedge}_{z\bar z = n} \psi(z)[z]$ 

where, as before, $\wedge$ restricts the summation to primary numbers. Writing 
z = r + is this becomes 

(23.2) $\lambda(n) = \sum^{\wedge}_{r^2 + s^2 = n} i^{\frac{r-1}{2}} \left(\frac{s}{|r|}\right) \psi(r + is)$ 

where naturally enough, in this sum $\wedge$ restricts to pairs of integers satisfying 
(5.4) and (5.5). Note that $\lambda(n)$ vanishes if n has prime factors other than 
$p \equiv 1 \pmod 4$. We suspect, but have not examined thoroughly, that $\lambda(n)$ are 
related to the Fourier coefficients of some kind of metaplectic Eisenstein series 
or a cusp form, by analogy to the Hecke eigenvalues (16.30) which generate a 
modular form of integral weight. In the quadratic eigenvalue (23.1) we focus 
on the symbol [z], yet the presence of the Hecke character offers welcome 
flexibility by means of which one can create simpler objects such as 

(23.3) $\lambda_0(n) = \sum_{r^2 + s^2 = n} \left(\frac{s}{r}\right)$ 

where r, s are both positive and r is odd. Here one can also put r, s into a 
prescribed sector and one can require these to be in fixed residue classes to a 
given modulus. Theorem 2 concerns the eigenvalue (23.1) stripped to (23.3). 
By analogy to $\mathcal{K}(M, N)$ we construct the bilinear form 

(23.4) $\mathcal{L}(M, N) = \sum_m \sum_n \alpha(m)\beta(n)\lambda(cmn)$ 

where $\alpha(m)$, $\beta(n)$ are complex coefficients with 

(23.5) $|\alpha(m)| \le 1$ for $1 \le m \le M$ 

(23.6) $|\beta(n)| \le 1$ for $1 \le n \le N$ 

and c is a positive integer. Have in mind that $\lambda$ is not multiplicative so the 
introduction of c makes the sum (23.4) more general and in practice this offers 
some extra flexibility. We shall also consider the restricted bilinear form 

(23.7) $\mathcal{L}^*(M, N) = \sum\sum_{(m, n) = 1} \alpha(m)\beta(n)\lambda(cmn)$ . 

From Proposition 21.4 we derive 

Proposition 23.1. For any complex coefficients $\alpha(m), \beta(n)$ satisfying 
(23.5), (23.6) respectively, and for any $c \ge 1$ we have 

(23.8) $\mathcal{L}(M, N) \ll \tau(c)(M + N)^{\frac{1}{12}} (MN)^{\frac{11}{12} + \varepsilon}$ 

where the implied constant depends only on $\varepsilon$. The same bound holds true for 
$\mathcal{L}^*(M, N)$. 

Proof. First we establish (23.8) for the reduced form $\mathcal{L}^*(M, N)$ with the 
coefficients $\alpha(m), \beta(n)$ supported on numbers prime to c. Since c, m, n are 
mutually co-prime we have 

$\lambda(cmn) = \sum^{\wedge}_{e\bar e = c} \psi(e) \sum^{\wedge}_{w\bar w = m} \sum^{\wedge}_{z\bar z = n} \psi(wz)[ewz]$ . 


Hence 

$\mathcal{L}^*(M, N) = \sum^{\wedge}_{e\bar e = c} \psi(e) \sum\sum_{(w\bar w, z\bar z) = 1} \alpha_w \beta_z \psi(wz)[ewz]$ 

where $\alpha_w = \alpha(w\bar w)$ and $\beta_z = \beta(z\bar z)$. We can assume that ewz is primitive or 
else the symbol [ewz] vanishes. Thus e is primitive and $(wz, \overline{wz}) = 1$ in which 
case the condition $(w\bar w, z\bar z) = 1$ reduces to (w, z) = 1. Applying (20.3) we get 

$\mathcal{L}^*(M, N) = \sum^{\wedge}_{e\bar e = c} \psi(e)[e] \sum\sum_{(w, z) = 1} \varepsilon\, \alpha_w \beta_z \psi(wz) \left(\frac{wz}{e}\right) [wz]$ 

where $\varepsilon = \pm 1$ depends only on the quadrants in which e, wz, and ewz are 
located. For each e we can separate the variables w, z in the $\varepsilon$-factor by the 
use of (20.15) and (20.19). Moreover we write 

$\psi(wz) \left(\frac{wz}{e}\right) = \psi(w) \left(\frac{w}{e}\right) \psi(z) \left(\frac{z}{e}\right)$ 

getting bilinear forms of type $\mathcal{K}^*(M, N)$ with their coefficients $\alpha_w, \beta_z$ twisted 
by characters. These forms satisfy the bound (21.13). Summing over e we 
conclude that $\mathcal{L}^*(M, N)$ satisfies (23.8). Now we can remove the condition 
that (c, mn) = 1 as follows 

$\mathcal{L}^*(M, N) = \sum\sum_{ab \mid c^\infty,\ (a, b) = 1}\ \sum\sum_{(m, n) = 1,\ (mn, c) = 1} \alpha(am)\beta(bn)\lambda(abcmn)$ 

$\ll \sum\sum_{ab \mid c^\infty,\ (a, b) = 1} \tau(abc) \left( a^{-1}M + b^{-1}N \right)^{\frac{1}{12}} \left( (ab)^{-1}MN \right)^{\frac{11}{12} + \varepsilon}$ 

$\ll \tau(c)(M + N)^{\frac{1}{12}} (MN)^{\frac{11}{12} + \varepsilon}$ . 

Finally we remove the condition (m, n) = 1 as follows 

$\mathcal{L}(M, N) = \sum_d \sum\sum_{(m, n) = 1} \alpha(dm)\beta(dn)\lambda(cd^2 mn)$ 

$\ll \sum_d \tau(cd^2)\, d^{-\frac{23}{12}} (M + N)^{\frac{1}{12}} (MN)^{\frac{11}{12} + \varepsilon}$ 

$\ll \tau(c)(M + N)^{\frac{1}{12}} (MN)^{\frac{11}{12} + \varepsilon}$ . 

This completes the proof of Proposition 23.1. □ 

By analogy to $\mathcal{K}(N)$ and $\mathcal{K}^*(N)$ we construct the special linear forms 

(23.9) $\mathcal{L}(N) = \sum_{n \le N} \lambda(mn)$ , 

(23.10) $\mathcal{L}^*(N) = \sum_{n \le N,\ (n, m) = 1} \lambda(mn)$ . 

Proposition 23.2. Given a Hecke character $\psi$ defined by (17.17) and 
a positive integer m we have the bounds 

(23.11) $\mathcal{L}(N) \ll d(|k| + 1)\tau(m)^4 \sqrt{m}\, N^{\frac34} \log mN$ , 

(23.12) $\mathcal{L}^*(N) \ll d(|k| + 1)\tau(m)^2 \sqrt{m}\, N^{\frac34} \log mN$ , 
where the implied constant is absolute. 

Proof. We have 

$\mathcal{L}^*(N) = \sum^{\wedge}_{w\bar w = m} \psi(w) \sum^{\wedge}_{z\bar z \le N,\ (z, w) = 1} \psi(z)[wz]$ 

so (23.12) follows from (22.4). Next we have 



e\m c|e°° n^N/c 
(n,m)=1 


so (23.11) follows from (23.12). □ 


24. Combinatorial identities for sums of arithmetic functions 

Given a nice arithmetic function $f : \mathbb{N} \to \mathbb{C}$ we are interested in estimating 
sums of f(n) twisted by the quadratic eigenvalue $\lambda(n)$. Since $\lambda(n)$ changes sign 
at random, one should expect a lot of cancellation so that a bound 

(24.1) $\sum_{n \le x} f(n)\lambda(n) \ll x^{1 - \delta}$ 

for some $\delta > 0$ is not out of the question for many interesting functions normalized 
as to satisfy $f(n) \ll n^{\varepsilon}$. In this section we develop some combinatorial 
identities by means of which one can reduce this problem to estimates for 
general bilinear forms $\mathcal{L}^*(M, N)$ and for special linear forms over primes. 

Fix $r \ge 2$. For every squarefree number, say 

$\ell = p_1 p_2 \cdots$, with $p_1 > p_2 > \ldots$ , 

we define the divisor $\partial = \partial(\ell)$ by setting 

(24.2) $\partial = p_1 \cdots p_r\, p_{2r}\, p_{3r} \cdots$ . 

Here and throughout this section the product of primes terminates when it 
runs out of primes of $\ell$. As usual the empty product is defined to be one. Note 
that 

(24.3) $\partial \le \ell^{1/r} p_1^{r-1}$ . 

We shall call $\partial = \partial(\ell)$ the “separation” divisor of $\ell$ for reasons soon to be clear. 
Then we define the divisors $m = m(\ell)$, $n = n(\ell)$ by setting 

(24.4) $m = (p_{r+1} \cdots p_{2r-1})(p_{3r+1} \cdots p_{4r-1}) \cdots$ , 

(24.5) $n = (p_{2r+1} \cdots p_{3r-1})(p_{4r+1} \cdots p_{5r-1}) \cdots$ . 

Note that 

(24.6) $n \le m \le \partial n$ . 

One can characterize m, n solely in terms of the separation divisor. Indeed, 
writing 

$\partial = \pi_1 \pi_2 \cdots$, with $\pi_1 > \pi_2 > \ldots$ , 

that is $\pi_j = p_j$ if $j \le r$ and $\pi_{k+r-1} = p_{kr}$ if $k \ge 1$, we see that m has the 
following properties: 

($\partial^+$) the first largest r - 1 prime divisors of m are in $(\pi_{r+1}, \pi_r)$, 
the second largest r - 1 prime divisors of m are in $(\pi_{r+3}, \pi_{r+2})$, 
the third ..., etc. 

Similarly n has the following properties: 

($\partial^-$) the first largest r - 1 prime divisors of n are in $(\pi_{r+2}, \pi_{r+1})$, 
the second largest r - 1 prime divisors of n are in $(\pi_{r+4}, \pi_{r+3})$, 
the third ..., etc. 

Hence we obtain the factorization 

(24.7) $\ell = \partial mn$ . 

and every squarefree $\ell$ has unique factorization of this type. By (24.6) and 
(24.7) it follows that 

Suppose $\ell \le x$ and that $\ell$ is free of prime divisors larger than 

(24.9) $z = x^{1/r^2}$ . 

Then it follows by (24.3) that the separation divisor is quite small, namely 
$\partial \le D$ with 

(24.10) $D = x^{2/r}$ . 
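
The factorization into a separation divisor and the two complementary divisors is purely positional, so it is easy to compute. The sketch below is illustrative and assumes the reading: with the primes of a squarefree $\ell$ listed in decreasing order as $p_1 > p_2 > \ldots$, the separation divisor collects $p_1, \ldots, p_r$ and every $p_{kr}$, the divisor $m$ collects the blocks $p_{r+1} \ldots p_{2r-1}$, $p_{3r+1} \ldots p_{4r-1}$, and so on, and $n$ collects the remaining blocks. The function name is hypothetical.

```python
from math import prod

def separation_factorization(primes, r):
    """Split a squarefree number, given by its distinct primes, into (d, m, n).

    Positions j are 1-based in decreasing order of the primes.
    d gets positions 1..r and all multiples of r, m gets the odd-numbered
    blocks between consecutive multiples of r, and n gets the even-numbered ones.
    """
    ps = sorted(primes, reverse=True)
    d = m = n = 1
    for j, p in enumerate(ps, start=1):
        block = (j - 1) // r          # block 0 is p_1 .. p_r
        if j <= r or j % r == 0:
            d *= p
        elif block % 2 == 1:
            m *= p
        else:
            n *= p
    return d, m, n

def check(primes, r):
    """Check l = d*m*n, n <= m <= d*n, and d^r <= l * p_1^(r(r-1))."""
    d, m, n = separation_factorization(primes, r)
    l, p1 = prod(primes), max(primes)
    return d * m * n == l and n <= m <= d * n and d ** r <= l * p1 ** (r * (r - 1))
```

For example, with the primes up to 37 and `r = 2` the call returns `m = 29 * 13 * 3` and `n = 19 * 7`, with every other prime going into the separation divisor.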

By the above combinatorics we arrive at 

Lemma 24.1. Let $f(\ell)$ be an arithmetic function supported on squarefree 
numbers $\ell \le x$. Let $r \ge 2$. Then we have 

(24.11) $\sum_{\ell \mid P(z)} f(\ell) = \sum_{\partial \le D}\ \sum\sum_{m, n \le \sqrt x} \gamma^{+}_{\partial}(m)\, \gamma^{-}_{\partial}(n)\, f(\partial mn)$ 

where $\gamma^{+}_{\partial}$, $\gamma^{-}_{\partial}$ are the characteristic functions of those integers having the 
properties ($\partial^+$), ($\partial^-$) respectively, and z, D are defined by (24.9), (24.10). 

In this lemma P(z) denotes the product of all primes $\le z$. To remove the 
restriction $\ell \mid P(z)$ we appeal to another combinatorial formula 

(24.12) $\sum_{\ell} f(\ell) = \sum_{\ell \mid P(z)} f(\ell) + \sum_{p > z} \sum_{q} f(pq)\, \nu(pq, z)^{-1}$ 

where $\nu(\ell, z)$ denotes the number of prime factors of $\ell$ which are > z. Note 
here that q runs over integers. We can write $\nu(pq, z)^{-1} = \gamma(q)$ since it does 
not depend on p. If $q \le z$ then $\gamma(q) = 1$. We single out this part of (24.12) 
and insert (24.11) getting 

Proposition 24.2. Let $f(\ell)$ be a function supported on squarefree 
numbers $\ell \le x$. Then we have 

(24.13) $\sum_{\ell} f(\ell) = \sum_{\partial \le D}\ \sum\sum_{m, n \le \sqrt x} \gamma^{+}_{\partial}(m)\, \gamma^{-}_{\partial}(n)\, f(\partial mn) + \sum\sum_{p, q > z} \gamma(q) f(pq) + \sum\sum_{p > z \ge q} f(pq)$ 

for any $r \ge 2$ where $\gamma^{+}_{\partial}$, $\gamma^{-}_{\partial}$ are the characteristic functions of the integers 
having the properties ($\partial^+$), ($\partial^-$) respectively, $\gamma(q) = (1 + \nu(q, z))^{-1}$, and z, D 
are defined by (24.9), (24.10). 

Remarks. On the right-hand side of (24.13) we have double sums over 
$m, n \le \sqrt x$ and p, q > z each of which is a bilinear form in f. The last sum in 
(24.13) will be treated for each q individually (since q is relatively small) as a 
special linear form over primes f(pq). 

25. Estimation of $S_\chi^k(\beta')$ 

In Section 18 we have completed the proof of Theorem 1 by an appeal to 
Proposition 17.2 which is yet to be established. In this section we reduce the 
proof of Proposition 17.2 to Theorem 2, actually in a somewhat more general 
form, and to the latter we devote the next (and last!) section. 

Recalling (17.14), (17.15) and (23.1) we write 

(25.1) S^((3') = Y ff(m)/i(m)7(m)A(m) 

(?n,n)=l 


where g(m) is a smooth function supported on $N' < m \le (1 + \theta)N'$ with 
$N < N' < 2N$ and satisfying $g^{(j)} \ll (\theta N)^{-j}$ for j = 0, 1, 2. In fact g(m) is the 
function p(n) in (4.13) but we have changed the notation to avoid ambiguity 
with the variable in sums over primes. We put 


(25.2) $\gamma(m) = \sum_{c \mid m,\ c \le C} \mu(c)$ 

with $1 \le C \le N^{1 - \eta}$ for a small $\eta > 0$. Therefore, changing the order of 
summation, 

s k x (P') = Y^ Y mg(c£)x(ce ). 

c^C (i,cU)=l 

(c,n)=i 

First we can assume that 

(25.3) $1 \le C \le N^{\eta}$ 

because the remaining partial sum of $S_\chi^k(\beta')$ over $m = c\ell$ with $N^{\eta} < c \le N^{1 - \eta}$ 
is a bilinear form of type $\mathcal{L}^*(A, B)$ with $A, B \le N^{1 - \eta}$, $AB \le 2N$ for which 
Proposition 23.1 gives the bound (after splitting into dyadic boxes, normalizing 
the coefficients, and separating the variables in $g(c\ell)$ via the Mellin transform) 

(25.4) $\mathcal{L}^*(A, B) \ll N^{1 - \frac{\eta}{12} + \varepsilon}$ , 
and this is sufficient for (17.18). 

Now we decompose the inner sum over $\ell$ according to the formula (24.13) 
with x = N getting 

where $T(c, \partial)$ and $T(c)$ are bilinear forms 


T(c,D ) = 

^22^2 7^"(m)7 a (n)n(mn)g(cmn)\(cmn 


m,rL^.y/N 


(mn,cU)=l 

T(c) - 

l(q)lJ-(pq)9(cpq)X{cpq) 


p,q>N 1 / r2 


and S(c) involves a long sum over primes with a small parameter, namely it is 
given by 

s(c) = EE g{cpq) X(cpq). 

The first double sum is a bilinear form of type (23.7) for which, after 
separating the variables m, n in g(cmn), Proposition 23.1 gives 

$T(c, \partial) \ll N^{\frac{23}{24} + \varepsilon}$ . 

Similarly for the second double sum Proposition 23.1 also gives 

$T(c) \ll N^{1 - \frac{1}{12 r^2} + \varepsilon}$ . 

The inner sum over primes in S(c) may be estimated using Theorem $2^{\psi}$, to be 
proved in the final section. This gives 

$S(c) \ll \sum_{q \le N^{1/r^2}} c\, d(|k| + 1)\, q N^{\frac{76}{77}} \ll c\, d(|k| + 1) N^{\frac{76}{77} + \frac{2}{r^2}}$ 

uniformly in c, d, |k|, the parameters of the Grossencharacter involved in $\lambda$. 
Adding these bounds, summing over c, and choosing r sufficiently large we 
obtain Proposition 17.2. 

It remains to prove Theorem $2^{\psi}$, a task we have postponed to the final 
section since it may generate an interest on its own. 

26. Sums of quadratic eigenvalues at primes 

Recall that 

(26.1) $\lambda(n) = \sum^{\wedge}_{z\bar z = n} \psi(z)[z]$ 

where $\psi(z)$ is the Grossencharacter (17.17) and [z] is the Jacobi-Kubota symbol. 
We shall establish 

Theorem $2^{\psi}$. For any $c \ge 1$ we have 

(26.2) $\sum_{n \le x} \Lambda(n)\lambda(cn) \ll c f x^{\frac{76}{77}}$ 

where $f = d(|k| + 1)$ and the implied constant is absolute. 

We begin by stating a result, see [DFI, Lemma 9] where also the proof may 
be found, which is useful for separating integral variables m, n constrained by 
an inequality $mn \le x$. 


Lemma 26.1. For $x \ge 1$ there exists a function h(t) such that 

$\int_{-\infty}^{\infty} |h(t)|\, dt < \log 6x$ 

and for every positive integer k 

$\int_{-\infty}^{\infty} h(t) k^{it}\, dt = 1$ if $k \le x$, and $= 0$ otherwise. 


For the proof of Theorem $2^{\psi}$ we split the sum according to the formula of 
R.C. Vaughan 

(26.3) $\Lambda(n) = \sum_{a \mid n,\ a \le y} \mu(a) \log\frac{n}{a} - \sum\sum_{ab \mid n;\ a, b \le y} \mu(a)\Lambda(b) + \sum\sum_{ab \mid n;\ a, b > y} \mu(a)\Lambda(b)$ . 

This identity is valid for n > y. If $n \le y$ then the right side vanishes. Hence 
the sum (26.2) splits into $S_0 + S_1 - S_2 + S_3$ where 

$S_0 = \sum_{n \le y} \Lambda(n)\lambda(cn) \ll \tau(c) y$ , 

$S_1 = \sum_{a \le y} \mu(a) \sum_{m \le x/a} \lambda(acm) \log m$ , 

$S_2 = \sum\sum_{a, b \le y} \mu(a)\Lambda(b) \sum_{m \le x/ab} \lambda(abcm)$ , 

$S_3 = \sum\sum_{ab \le x;\ a, b > y} \mu(a)\Lambda(b) \sum_{m \le x/ab} \lambda(abcm)$ . 
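
Vaughan's identity in the form used here can be confirmed numerically. The sketch below is illustrative: it evaluates the right-hand side $\sum_{a \mid n,\, a \le y} \mu(a)\log(n/a) - \sum\sum_{ab \mid n;\, a,b \le y} \mu(a)\Lambda(b) + \sum\sum_{ab \mid n;\, a,b > y} \mu(a)\Lambda(b)$ by direct enumeration of divisors and compares it with $\Lambda(n)$ for $n > y$ and with $0$ for $n \le y$. The helper names are hypothetical and the arithmetic functions use naive trial division.

```python
from math import log

def prime_factors(n):
    """Prime factorisation of n as {p: exponent}, by trial division."""
    factors, p = {}, 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors

def mobius(n):
    f = prime_factors(n)
    return 0 if any(e > 1 for e in f.values()) else (-1) ** len(f)

def von_mangoldt(n):
    """log p if n is a power of the prime p, else 0."""
    f = prime_factors(n)
    return log(next(iter(f))) if len(f) == 1 else 0.0

def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]

def vaughan_rhs(n, y):
    """Right-hand side of Vaughan's identity with both cut-offs equal to y."""
    divs = divisors(n)
    first = sum(mobius(a) * log(n / a) for a in divs if a <= y)
    second = third = 0.0
    for a in divs:
        for b in divisors(n // a):      # all pairs (a, b) with ab | n
            term = mobius(a) * von_mangoldt(b)
            if a <= y and b <= y:
                second += term
            elif a > y and b > y:
                third += term
    return first - second + third
```

Because the identity is pointwise in `n`, weighting it by any sequence and summing over `n` yields the splitting into four sums.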

In $S_1$ we first insert $\log m = \int_1^m t^{-1}\, dt$ to avoid partial summation and, only 
then, apply Proposition 23.2 to infer that 

$S_1 \ll c^{\frac12} f \sum_{a \le y} \tau(ac)^4 a^{-\frac14} x^{\frac34} (\log cx)^2 \ll c f y^{\frac34} x^{\frac34 + \varepsilon}$ . 

For $S_2$ we directly apply Proposition 23.2, this time getting 

$S_2 \ll c f y^{\frac32} x^{\frac34 + \varepsilon}$ . 

By Proposition 23.1 we infer that (after splitting the variables into dyadic 
segments and separating them by the aid of Lemma 26.1) 

$S_3 \ll c\, y^{-\frac{1}{12}} x^{1 + \varepsilon}$ . 

We sum up the above bounds and choose $y = x^{\frac{3}{19}}$ getting $c f x^{\frac{75}{76} + \varepsilon}$ which is 
slightly better than (26.2). 
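
The choice of $y$ is a balancing of exponents and can be reproduced with exact fractions. The sketch below is illustrative and assumes the bounds read, up to $x^{\varepsilon}$ and the factors $c$, $f$: $S_1 \ll y^{3/4}x^{3/4}$, $S_2 \ll y^{3/2}x^{3/4}$, $S_3 \ll y^{-1/12}x$. Writing $y = x^{t}$, each bound becomes $x$ to a linear function of $t$. The function name is hypothetical.

```python
from fractions import Fraction as F

def balance(slope_a, const_a, slope_b, const_b):
    """Solve slope_a*t + const_a == slope_b*t + const_b; return (t, common value)."""
    t = (const_b - const_a) / (slope_a - slope_b)
    return t, slope_a * t + const_a

# Balancing S_2 (exponent 3t/2 + 3/4) against S_3 (exponent 1 - t/12).
t_main, exponent_main = balance(F(3, 2), F(3, 4), F(-1, 12), F(1))

# Variant: if the worst remaining term is of type S_1 (exponent 3t/4 + 3/4).
t_alt, exponent_alt = balance(F(3, 4), F(3, 4), F(-1, 12), F(1))
```

The first balance gives $t = 3/19$ and exponent $75/76$, below $76/77$; the second gives $t = 3/10$ and exponent $39/40$.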

Remarks. Theorem $2^{\psi}$ should be compared with the bound in [DI] for 
the corresponding sum over primes of the Fourier coefficients of a cusp form 
with respect to the theta multiplier. However the arguments of [DI] are rather 
different from those used here. Of course, we have not made an attempt to 
get the best bound by the available technology. For example if we had treated 
the part of $S_2$ corresponding to $ab \le y$ as a sum of type $S_1$ and the remaining 
part as a sum of type $S_3$ the choice $y = x^{3/10}$ would give (26.2) with exponent 
$\frac{39}{40} + \varepsilon$. We challenge the reader (if she/he is not yet burnt out of energy as 
we are) to further reduce this exponent to a single-digit fraction. It could be 
also interesting to improve the dependence of the bound (26.2) on the involved 
parameters c and f or simply to get the nontrivial bound 

$\sum_{n \le x} \Lambda(n)\lambda(cn) \ll x^{1 - \delta}$ 

but uniformly in cf as large as possible. 